October 1993
Order Number: 271052-009

MILITARY Intel386(TM) HIGH PERFORMANCE 32-BIT MICROPROCESSOR WITH INTEGRATED MEMORY MANAGEMENT

- Flexible 32-bit microprocessor
  - 8, 16, 32-bit data types
  - 8 general purpose 32-bit registers
- Very large address space
  - 4 gigabyte physical
  - 64 terabyte virtual
  - 4 gigabyte maximum segment size
- Integrated memory management unit
  - Virtual memory support
  - Optional on-chip paging
  - 4 levels of protection
  - Fully compatible with M80286
- Object code compatible with all M8086 family microprocessors
- High speed numerics support via Intel387(TM) coprocessor
- Hardware debugging support
- Virtual M8086 Mode allows running of M8086 software in a protected and paged system
- Optimized for system performance
  - Pipelined instruction execution
  - On-chip address translation caches
  - 32 megabytes/sec bus bandwidth
- Complete system development support
  - Software: C, PL/M, assembler, system generation tools
  - Debuggers: PSCOPE, ICE(TM)-Intel386
- High speed CHMOS IV technology
- 132-lead pin grid array package and 164-lead ceramic quad flatpack package (see Packaging Specification, Order #231369)
- Available in three product grades:
  - MIL-STD-883, -55°C to +125°C (T_C)
  - Military temperature only, -55°C to +125°C (T_C)
  - Extended temperature, -40°C to +110°C (T_C)

The Military Intel386 microprocessor is an advanced 32-bit component designed for applications needing very high performance and optimized for multitasking operating systems. The 32-bit registers and data paths support 32-bit addresses and data types. The processor addresses up to four gigabytes of physical memory and 64 terabytes (2^46 bytes) of virtual memory. The integrated memory management and protection architecture includes address translation registers, advanced multitasking hardware and a protection mechanism to support operating systems. In addition, the Military Intel386 microprocessor allows the simultaneous running of multiple operating systems.
Instruction pipelining, on-chip address translation, and high bus bandwidth ensure short average instruction execution times and high system throughput. The Military Intel386 processor is capable of execution at sustained rates of between 3 and 4 million instructions per second.

The Military Intel386 processor offers new testability and debugging features. Testability features include a self-test and direct access to the page translation cache. Four new breakpoint registers provide breakpoint traps on code execution or data accesses, for powerful debugging of even ROM-based systems.

Object-code compatibility with all M8086 family members (M8086, M8088, M80186, M80286) means the Military Intel386 processor offers immediate access to the world's largest microprocessor software base.

UNIX is a trademark of AT&T Bell Labs.
MS-DOS is a trademark of Microsoft Corporation.
Intel386 is a trademark of Intel Corporation.
CONTENTS

1.0 BASE ARCHITECTURE
1.1 Introduction
1.2 Register Overview
1.3 Register Descriptions
1.3.1 General Purpose Registers
1.3.2 Instruction Pointer
1.3.3 Flags Register
1.3.4 Segment Registers
1.3.5 Segment Descriptor Registers
1.3.6 Control Registers
1.3.7 System Address Registers
1.3.8 Debug and Test Registers
1.3.9 Register Accessibility
1.3.10 Compatibility
1.4 Instruction Set
1.4.1 Instruction Set Overview
1.4.2 Military Intel386(TM) Microprocessor Instructions
1.5 Addressing Modes
1.5.1 Addressing Modes Overview
1.5.2 Register and Immediate Modes
1.5.3 32-Bit Memory Addressing Modes
1.5.4 Differences Between 16- and 32-Bit Addresses
1.6 Data Types
1.7 Memory Organization
1.7.1 Introduction
1.7.2 Address Spaces
1.7.3 Segment Register Usage
1.8 I/O Space
1.9 Interrupts
1.9.1 Interrupts and Exceptions
1.9.2 Interrupt Processing
1.9.3 Maskable Interrupt
1.9.4 Non-Maskable Interrupt
1.9.5 Software Interrupts
1.9.6 Interrupt and Exception Priorities
1.9.7 Instruction Restart
1.9.8 Double Faults
1.10 Reset and Initialization
1.11 Testability
1.11.1 Self-Test
1.11.2 TLB Testing
1.12 Debugging Support
1.12.1 Breakpoint Instruction
1.12.2 Single-Step Trap
1.12.3 Debug Registers
1.12.3.1 Linear Address Breakpoint Registers (DR0-DR3)
1.12.3.2 Debug Control Register (DR7)
1.12.3.3 Debug Status Register (DR6)
1.12.3.4 Use of Resume Flag (RF) in Flag Register

2.0 REAL MODE ARCHITECTURE
2.1 Real Mode Instruction
2.2 Memory Addressing
2.3 Reserved Locations
2.4 Interrupts
2.5 Shutdown and Halt

3.0 PROTECTED MODE ARCHITECTURE
3.1 Introduction
3.2 Addressing Mechanism
3.3 Segmentation
3.3.1 Segmentation Introduction
3.3.2 Terminology
3.3.3 Descriptor Tables
3.3.3.1 Descriptor Tables Introduction
3.3.3.2 Global Descriptor Table
3.3.3.3 Local Descriptor Table
3.3.3.4 Interrupt Descriptor Table
3.3.4 Descriptors
3.3.4.1 Descriptor Attribute Bits
3.3.4.2 Intel386(TM) Code, Data Descriptors (S = 1)
3.3.4.3 System Descriptor Formats
3.3.4.4 LDT Descriptors (S = 0, TYPE = 2)
3.3.4.5 TSS Descriptors (S = 0, TYPE = 1, 3, 9, B)
3.3.4.6 Gate Descriptors (S = 0, TYPE = 4-7, C, F)
3.3.4.7 Differences Between Military Intel386(TM) Microprocessor and 286 Descriptors
3.3.4.8 Selector Fields
3.3.4.9 Segment Descriptor Cache
3.3.4.10 Segment Descriptor Register Settings
3.4 Protection
3.4.1 Protection Concepts
3.4.2 Rules of Privilege
3.4.3 Privilege Levels
3.4.3.1 Task Privilege
3.4.3.2 Selector Privilege (RPL)
3.4.3.3 I/O Privilege Level and I/O Permission Bitmap
3.4.3.4 Privilege Validation
3.4.3.5 Descriptor Access
3.4.4 Privilege Level Transfers
3.4.5 Call Gates
3.4.6 Task Switching
3.4.7 Initialization and Transition to Protected Mode
3.4.8 Tools for Building Protected Systems
3.5 Paging
3.5.1 Paging Concepts
3.5.2 Paging Organization
3.5.2.1 Page Mechanism
3.5.2.2 Page Descriptor Base Register
3.5.2.3 Page Directory
3.5.2.4 Page Tables
3.5.2.5 Page Directory/Table Entries
3.5.3 Page Level Protection (R/W, U/S Bits)
3.5.4 Translation Lookaside Buffer
3.5.5 Paging Operation
3.5.6 Operating System Responsibilities
3.6 Virtual M8086 Environment
3.6.1 Executing M8086 Programs
3.6.2 Virtual M8086 Mode Addressing Mechanism
3.6.3 Paging in Virtual Mode
3.6.4 Protection and I/O Permission Bitmap
3.6.5 Interrupt Handling
3.6.6 Entering and Leaving Virtual M8086 Mode
3.6.6.1 Task Switches To/From Virtual M8086 Mode
3.6.6.2 Transitions Through Trap and Interrupt Gates, and IRET

4.0 FUNCTIONAL DATA
4.1 Introduction
4.2 Signal Description
4.2.1 Introduction
4.2.2 Clock (CLK2)
4.2.3 Data Bus (D0 through D31)
4.2.4 Address Bus (BE0# through BE3#, A2 through A31)
4.2.5 Bus Cycle Definition Signals (W/R#, D/C#, M/IO#, LOCK#)
4.2.6 Bus Control Signals
4.2.6.1 Introduction
4.2.6.2 Address Status (ADS#)
4.2.6.3 Transfer Acknowledge (READY#)
4.2.6.4 Next Address Request (NA#)
4.2.6.5 Bus Size 16 (BS16#)
4.2.7 Bus Arbitration Signals
4.2.7.1 Introduction
4.2.7.2 Bus Hold Request (HOLD)
4.2.7.3 Bus Hold Acknowledge (HLDA)
4.2.8 Coprocessor Interface Signals
4.2.8.1 Introduction
4.2.8.2 Coprocessor Request (PEREQ)
4.2.8.3 Coprocessor Busy (BUSY#)
4.2.8.4 Coprocessor Error (ERROR#)
4.2.9 Interrupt Signals
4.2.9.1 Introduction
4.2.9.2 Maskable Interrupt Request (INTR)
4.2.9.3 Non-Maskable Interrupt Request (NMI)
4.2.9.4 Reset (RESET)
4.2.10 Signal Summary
4.3 Bus Transfer Mechanism
4.3.1 Introduction
4.3.2 Memory and I/O Spaces
4.3.3 Memory and I/O Organization
4.3.4 Dynamic Data Bus Sizing
4.3.5 Interfacing with 32- and 16-Bit Memories
4.3.6 Operand Alignment
4.4 Bus Functional Description
4.4.1 Introduction
4.4.2 Address Pipelining
4.4.3 Read and Write Cycles
4.4.3.1 Introduction
4.4.3.2 Non-Pipelined Address
4.4.3.3 Non-Pipelined Address with Dynamic Data Bus Sizing
4.4.3.4 Pipelined Address
4.4.3.5 Initiating and Maintaining Pipelined Address
4.4.3.6 Pipelined Address with Dynamic Data Bus Sizing
4.4.4 Interrupt Acknowledge (INTA) Cycles
4.4.5 Halt Indication Cycle
4.4.6 Shutdown Indication Cycle
4.5 Other Functional Descriptions
4.5.1 Entering and Exiting Hold Acknowledge
4.5.2 Reset During Hold Acknowledge
4.5.3 Bus Activity During and Following Reset
4.6 Self-Test Signature
4.7 Component and Revision Identifiers
4.8 Coprocessor Interfacing
4.8.1 Software Testing for Coprocessor Presence

5.0 MECHANICAL DATA
5.1 Introduction
5.2 Pin Assignment

6.0 ELECTRICAL DATA
6.1 Introduction
6.2 Power and Grounding
6.2.1 Power Connections
6.2.2 Power Decoupling Recommendations
6.2.3 Resistor Recommendations
6.2.4 Other Connection Recommendations
6.3 Maximum Ratings
6.4 Operating Conditions
6.5 DC Specifications
6.6 AC Specifications
6.6.1 AC Specification Definitions
6.6.2 AC Specification Tables
6.6.3 AC Test Loads
6.6.4 AC Timing Waveforms
6.7 Designing for ICE(TM)-386 Use

7.0 INSTRUCTION SET
7.1 Military Intel386(TM) Processor Instruction Encoding and Clock Count Summary
7.2 Instruction Encoding
7.2.1 Overview
7.2.2 32-Bit Extensions of the Instruction Set
7.2.3 Encoding of the Instruction Fields
7.2.3.1 Encoding of Operand Length (w) Field
7.2.3.2 Encoding of the General Register (reg) Field
7.2.3.3 Encoding of the Segment Register (sreg) Field
7.2.3.4 Encoding of Address Mode
7.2.3.5 Encoding of Operation Direction (d) Field
7.2.3.6 Encoding of Sign-Extend (s) Field
7.2.3.7 Encoding of Conditional Test (tttn) Field
7.2.3.8 Encoding of Control or Debug or Test Register (eee) Field
[Figure 1-1. Military Intel386(TM) Processor Pipelined 32-Bit Microarchitecture]
1.0 BASE ARCHITECTURE

1.1 Introduction

The Military Intel386 microprocessor consists of a central processing unit, a memory management unit and a bus interface.

The central processing unit consists of the execution unit and instruction unit. The execution unit contains the eight 32-bit general purpose registers, which are used for both address calculation and data operations, and a 64-bit barrel shifter used to speed shift, rotate, multiply, and divide operations. The multiply and divide logic uses a 1-bit per cycle algorithm. The multiply algorithm stops the iteration when the most significant bits of the multiplier are all zero. This allows typical 32-bit multiplies to be executed in under one microsecond. The instruction unit decodes the instruction opcodes and stores them in the decoded instruction queue for immediate use by the execution unit.

The memory management unit (MMU) consists of a segmentation unit and a paging unit. Segmentation allows the managing of the logical address space by providing an extra addressing component, one that allows easy code and data relocatability, and efficient sharing. The paging mechanism operates beneath and is transparent to the segmentation process, to allow management of the physical address space. Each segment is divided into one or more 4 Kbyte pages. To implement a virtual memory system, the Military Intel386 microprocessor supports full restartability for all page and segment faults.

Memory is organized into one or more variable length segments, each up to four gigabytes in size. A given region of the linear address space, a segment, can have attributes associated with it. These attributes include its location, size, type (i.e. stack, code or data), and protection characteristics.
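As a rough illustration of the early-terminating multiply described in section 1.1, the following Python sketch models a shift-and-add multiplier that retires one multiplier bit per iteration and stops as soon as the remaining high-order multiplier bits are all zero. The function name and the iteration counter are hypothetical; the counter only shows how loop length tracks the multiplier's significant bits and is not a clock-count model of the actual hardware.

```python
MASK32 = 0xFFFFFFFF


def shift_add_multiply(multiplicand, multiplier):
    """Unsigned 32 x 32 -> 64-bit multiply, one multiplier bit per iteration.

    Returns (product, iterations). This is an illustrative model only;
    the real multiply logic is not specified at this level in the text.
    """
    multiplicand &= MASK32
    multiplier &= MASK32
    product = 0
    iterations = 0
    # The loop ends as soon as every remaining (more significant)
    # multiplier bit is zero, which is the early-out the text describes.
    while multiplier:
        if multiplier & 1:
            product += multiplicand
        multiplicand <<= 1
        multiplier >>= 1
        iterations += 1
    return product, iterations
```

Under this model a small multiplier such as 7 finishes in three iterations, while a multiplier with bit 31 set needs all 32, which is one way to read the claim that typical multiplies complete quickly.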
Each task on a Military Intel386 microprocessor can have a maximum of 16,381 segments of up to four gigabytes each, thus providing 64 terabytes (trillion bytes) of virtual memory to each task.

The segmentation unit provides four levels of protection for isolating and protecting applications and the operating system from each other. The hardware enforced protection allows the design of systems with a high degree of integrity.

The Military Intel386 microprocessor has two modes of operation: Real Address Mode (Real Mode), and Protected Virtual Address Mode (Protected Mode). In Real Mode the Military Intel386 microprocessor operates as a very fast M8086, but with 32-bit extensions if desired. Real Mode is required primarily to set up the processor for Protected Mode operation. Protected Mode provides access to the sophisticated memory management, paging and privilege capabilities of the processor.

Within Protected Mode, software can perform a task switch to enter into tasks designated as Virtual M8086 Mode tasks. Each such task behaves with M8086 semantics, thus allowing M8086 software (an application program, or an entire operating system) to execute. The Virtual M8086 tasks can be isolated and protected from one another and from the host Military Intel386 microprocessor operating system by the use of paging and the I/O Permission Bitmap.

Finally, to facilitate high performance system hardware designs, the Military Intel386 microprocessor bus interface offers address pipelining, dynamic data bus sizing, and direct Byte Enable signals for each byte of the data bus. These hardware features are described fully beginning in Section 4.

1.2 Register Overview

The Military Intel386 processor has 32 register resources in the following categories:

- General purpose registers
- Segment registers
- Instruction pointer and flags
- Control registers
- System address registers
- Debug registers
- Test registers
The registers are a superset of the M8086, M80186 and M80286 registers, so all 16-bit M8086, M80186 and M80286 registers are contained within the 32-bit Military Intel386 microprocessor.

Figure 2-1 shows all of the Military Intel386 microprocessor base architecture registers, which include the general address and data registers, the instruction pointer, and the flags register. The contents of these registers are task-specific, so these registers are automatically loaded with a new context upon a task switch operation.

The base architecture also includes six directly accessible segments, each up to 4 Gbytes in size. The segments are indicated by the selector values placed in the Military Intel386 microprocessor segment registers of Figure 2-1. Various selector values can be loaded as a program executes, if desired.
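The statement that the 16-bit registers are contained within the 32-bit registers can be sketched in Python using EAX as the example. The helper names are hypothetical; the AX, AH and AL names and bit ranges are those given in Figure 2-1 and section 1.3.1, and the behaviour modelled is only that a narrower write leaves the remaining bits of the 32-bit register unchanged.

```python
def read_ax(eax):
    """AX is bits 0-15 of EAX."""
    return eax & 0xFFFF


def read_al(eax):
    """AL is bits 0-7 of EAX."""
    return eax & 0xFF


def read_ah(eax):
    """AH is bits 8-15 of EAX."""
    return (eax >> 8) & 0xFF


def write_ax(eax, value):
    """Replace bits 0-15; the upper 16 bits are neither used nor changed."""
    return (eax & 0xFFFF0000) | (value & 0xFFFF)


def write_al(eax, value):
    """Replace bits 0-7 only."""
    return (eax & 0xFFFFFF00) | (value & 0xFF)


def write_ah(eax, value):
    """Replace bits 8-15 only."""
    return (eax & 0xFFFF00FF) | ((value & 0xFF) << 8)
```

For example, with EAX holding 12345678H, `read_ax` yields 5678H and `write_ax(eax, 0xABCD)` yields 1234ABCDH. The same masking applies to EBX, ECX and EDX; the remaining general registers have 16-bit names but no separately named byte halves.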
Figure 2-1. Military Intel386 Microprocessor Base Architecture Registers

  General Data and Address Registers
  32-bit register (bits 31-0)    16-bit name (bits 15-0)
  EAX                            AX
  EBX                            BX
  ECX                            CX
  EDX                            DX
  ESI                            SI
  EDI                            DI
  EBP                            BP
  ESP                            SP

  Segment Selector Registers (bits 15-0)
  CS                Code
  SS                Stack
  DS, ES, FS, GS    Data

  Instruction Pointer and Flags Register
  32-bit register (bits 31-0)    16-bit name (bits 15-0)
  EIP                            IP
  EFLAGS                         FLAGS

The selectors are also task-specific, so the segment registers are automatically loaded with new context upon a task switch operation.

The other types of registers, Control, System Address, Debug, and Test, are primarily used by system software.

1.3 Register Descriptions

1.3.1 General Purpose Registers

The eight general purpose registers of 32 bits hold data or address quantities. The general registers, Figure 2-2, support data operands of 1, 8, 16, 32 and 64 bits, and bit fields of 1 to 32 bits. They support address operands of 16 and 32 bits. The 32-bit registers are named EAX, EBX, ECX, EDX, ESI, EDI, EBP, and ESP.

The least significant 16 bits of the registers can be accessed separately. This is done by using the 16-bit names of the registers AX, BX, CX, DX, SI, DI, BP, and SP. When accessed as a 16-bit operand, the upper 16 bits of the register are neither used nor changed.

Finally, 8-bit operations can individually access the lowest byte (bits 0-7) and the higher byte (bits 8-15) of general purpose registers AX, BX, CX and DX. The lowest bytes are named AL, BL, CL and DL, respectively. The higher bytes are named AH, BH, CH and DH, respectively. The individual byte accessibility offers additional flexibility for data operations, but is not used for effective address calculation.

Figure 2-2. General Registers and Instruction Pointer

  32-bit (bits 31-0)   16-bit (bits 15-0)   High byte (bits 15-8)   Low byte (bits 7-0)
  EAX                  AX                   AH                      AL
  EBX                  BX                   BH                      BL
  ECX                  CX                   CH                      CL
  EDX                  DX                   DH                      DL
  ESI                  SI
  EDI                  DI
  EBP                  BP
  ESP                  SP
  EIP                  IP

1.3.2 Instruction Pointer

The instruction pointer, Figure 2-2, is a 32-bit register named EIP.
EIP holds the offset of the next instruction to be executed. The offset is always relative to the base of the code segment (CS). The lower 16 bits (bits 0-15) of EIP contain the 16-bit instruction pointer named IP, which is used by 16-bit addressing.

1.3.3 Flags Register

The Flags Register is a 32-bit register named EFLAGS. The defined bits and bit fields within EFLAGS, shown in Figure 2-3, control certain operations and indicate status of the Military Intel386 microprocessor. The lower 16 bits (bits 0-15) of EFLAGS contain the 16-bit flag register named FLAGS, which is most useful when executing M8086 and M80286 code.
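The EFLAGS layout can be sketched in Python using the bit numbers given in the flag descriptions of this section. The mask names follow the flag mnemonics; the helper functions, including `add_status_flags`, are hypothetical and model only addition of unsigned operands according to the stated definitions of CF, PF, AF, ZF, SF and OF. This is a sketch for checking one's reading of the definitions, not a description of the processor's internal logic.

```python
# Bit positions taken from the flag descriptions in section 1.3.3.
CF = 1 << 0
PF = 1 << 2
AF = 1 << 4
ZF = 1 << 6
SF = 1 << 7
TF = 1 << 8
IF = 1 << 9
DF = 1 << 10
OF = 1 << 11
IOPL_SHIFT = 12
IOPL_MASK = 0b11 << IOPL_SHIFT
NT = 1 << 14
RF = 1 << 16
VM = 1 << 17


def flags16(eflags):
    """FLAGS is the lower 16 bits of EFLAGS."""
    return eflags & 0xFFFF


def iopl(eflags):
    """Extract the two-bit IOPL field (bits 12-13)."""
    return (eflags & IOPL_MASK) >> IOPL_SHIFT


def add_status_flags(a, b, width=32):
    """Return (result, flags) for a + b at an operand width of 8, 16 or 32."""
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    a &= mask
    b &= mask
    total = a + b
    result = total & mask
    carry_out = total > mask
    # Carry into the sign bit: add everything below the sign bit.
    carry_in = (((a & (sign - 1)) + (b & (sign - 1))) & sign) != 0
    flags = 0
    if carry_out:
        flags |= CF
    if bin(result & 0xFF).count("1") % 2 == 0:  # low eight bits only
        flags |= PF
    if (a & 0xF) + (b & 0xF) > 0xF:             # carry out of bit 3
        flags |= AF
    if result == 0:
        flags |= ZF
    if result & sign:
        flags |= SF
    if carry_in != carry_out:                   # signed overflow
        flags |= OF
    return result, flags
```

As a usage example, adding 7FH and 01H as 8-bit operands gives 80H with AF, SF and OF set, while adding FFH and 01H gives 00H with CF, PF, AF and ZF set.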
[Figure 2-3. Flags Register]

VM (Virtual M8086 Mode, bit 17)
  The VM bit provides Virtual M8086 Mode within Protected Mode. If set while the Military Intel386 microprocessor is in Protected Mode, the processor will switch to Virtual M8086 operation, handling segment loads as the M8086 does, but generating exception 13 faults on privileged opcodes. The VM bit can be set only in Protected Mode, by the IRET instruction (if current privilege level = 0) and by task switches at any privilege level. The VM bit is unaffected by POPF. PUSHF always pushes a 0 in this bit, even if executing in Virtual M8086 Mode. The EFLAGS image pushed during interrupt processing or saved during task switches will contain a 1 in this bit if the interrupted code was executing as a Virtual M8086 task.

RF (Resume Flag, bit 16)
  The RF flag is used in conjunction with the debug register breakpoints. It is checked at instruction boundaries before breakpoint processing. When RF is set, it causes any debug fault to be ignored on the next instruction. RF is then automatically reset at the successful completion of every instruction (no faults are signalled) except the IRET instruction, the POPF instruction, and JMP, CALL, and INT instructions causing a task switch. These instructions set RF to the value specified by the memory image. For example, at the end of the breakpoint service routine, the IRET instruction can pop an EFLAGS image having the RF bit set and resume the program's execution at the breakpoint address without generating another breakpoint fault on the same location.

NT (Nested Task, bit 14)
  This flag applies to Protected Mode. NT is set to indicate that the execution of this task is nested within another task. If set, it indicates that the current nested task's Task State Segment (TSS) has a valid back link to the previous task's TSS. This bit is set or reset by control transfers to other tasks.
  The value of NT in EFLAGS is tested by the IRET instruction to determine whether to do an inter-task return or an intra-task return. A POPF or an IRET instruction will affect the setting of this bit according to the image popped, at any privilege level.

IOPL (Input/Output Privilege Level, bits 12-13)
  This two-bit field applies to Protected Mode. IOPL indicates the numerically maximum CPL (current privilege level) value permitted to execute I/O instructions without generating an exception 13 fault or consulting the I/O Permission Bitmap. It also indicates the maximum CPL value allowing alteration of the IF (INTR Enable Flag) bit when new values are popped into the EFLAGS register. POPF and IRET instructions can alter the IOPL field when executed at CPL = 0. Task switches can always alter the IOPL field, when the new flag image is loaded from the incoming task's TSS.
OF (Overflow Flag, bit 11)
  OF is set if the operation resulted in a signed overflow. Signed overflow occurs when the operation resulted in carry/borrow into the sign bit (high-order bit) of the result but did not result in a carry/borrow out of the high-order bit, or vice-versa. For 8/16/32 bit operations, OF is set according to overflow at bit 7/15/31, respectively.

DF (Direction Flag, bit 10)
  DF defines whether the ESI and/or EDI registers postdecrement or postincrement during the string instructions. Postincrement occurs if DF is reset. Postdecrement occurs if DF is set.

IF (INTR Enable Flag, bit 9)
  The IF flag, when set, allows recognition of external interrupts signalled on the INTR pin. When IF is reset, external interrupts signalled on the INTR pin are not recognized. IOPL indicates the maximum CPL value allowing alteration of the IF bit when new values are popped into EFLAGS or FLAGS.

TF (Trap Enable Flag, bit 8)
  TF controls the generation of the exception 1 trap when single-stepping through code. When TF is set, the Military Intel386 processor generates an exception 1 trap after the next instruction is executed. When TF is reset, exception 1 traps occur only as a function of the breakpoint addresses loaded into debug registers DR0-DR3.

SF (Sign Flag, bit 7)
  SF is set if the high-order bit of the result is set; it is reset otherwise. For 8-, 16-, 32-bit operations, SF reflects the state of bit 7, 15, 31 respectively.

ZF (Zero Flag, bit 6)
  ZF is set if all bits of the result are 0. Otherwise it is reset.

AF (Auxiliary Carry Flag, bit 4)
  The Auxiliary Flag is used to simplify the addition and subtraction of packed BCD quantities. AF is set if the operation resulted in a carry out of bit 3 (addition) or a borrow into bit 3 (subtraction). Otherwise AF is reset. AF is affected by carry out of, or borrow into, bit 3 only, regardless of overall operand length: 8, 16 or 32 bits.
PF (Parity Flag, bit 2)
  PF is set if the low-order eight bits of the operation contain an even number of "1's" (even parity). PF is reset if the low-order eight bits have odd parity. PF is a function of only the low-order eight bits, regardless of operand size.

CF (Carry Flag, bit 0)
  CF is set if the operation resulted in a carry out of (addition), or a borrow into (subtraction) the high-order bit. Otherwise CF is reset. For 8-, 16- or 32-bit operations, CF is set according to carry/borrow at bit 7, 15 or 31, respectively.

NOTE: In these descriptions, "set" means "set to 1," and "reset" means "reset to 0."

1.3.4 Segment Registers

Six 16-bit segment registers hold segment selector values identifying the currently addressable memory segments. Segment registers are shown in Figure 2-4. In Protected Mode, each segment may range in size from one byte up to the entire linear and physical space of the machine, 4 Gbytes (2^32 bytes). If a maximum-sized segment is used (limit = FFFFFFFFH) it should be Dword aligned (i.e., the two least significant bits of the segment base should be zero). This will avoid a segment limit violation (exception 13) caused by the wraparound. In Real Address Mode, the maximum segment size is fixed at 64 Kbytes (2^16 bytes).

Figure 2-4. Military Intel386(TM) Microprocessor Segment Registers, and Associated Descriptor Registers

  Segment registers      Descriptor registers (loaded automatically)
  (bits 15-0)            Physical base address | Segment limit | Other segment attributes from descriptor
  CS  selector           (one descriptor register per segment register)
  SS  selector
  DS  selector
  ES  selector
  FS  selector
  GS  selector

The six segments addressable at any given moment are defined by the segment registers CS, SS, DS, ES, FS and GS. The selector in CS indicates the current code segment; the selector in SS indicates the current stack segment; the selectors in DS, ES, FS and GS indicate the current data segments.

1.3.5 Segment Descriptor Registers

The segment descriptor registers are not programmer visible, yet it is very useful to understand their content. Inside the Military Intel386 processor, a descriptor register (programmer invisible) is associated with each programmer-visible segment register, as shown by Figure 2-4. Each descriptor register holds a 32-bit segment base address, a 32-bit segment limit, and the other necessary segment attributes.

When a selector value is loaded into a segment register, the associated descriptor register is automatically updated with the correct information. In Real Address Mode, only the base address is updated directly (by shifting the selector value four bits to the left), since the segment maximum limit and attributes are fixed in Real Mode. In Protected Mode, the base address, the limit, and the attributes are all updated per the contents of the segment descriptor indexed by the selector.

Whenever a memory reference occurs, the segment descriptor register associated with the segment being used is automatically involved with the memory reference.
The 32-bit segment base address becomes a component of the linear address calculation, the 32-bit limit is used for the limit-check operation, and the attributes are checked against the type of memory reference requested.

1.3.6 Control Registers

The Military Intel386 microprocessor has three control registers of 32 bits, CR0, CR2 and CR3, to hold machine state of a global nature (not specific to an individual task). These registers, along with the System Address Registers described in the next section, hold machine state that affects all tasks in the system. To access the Control Registers, load and store instructions are defined.

CR0: Machine Control Register (includes M80286 Machine Status Word)

CR0, shown in Figure 2-5, contains 6 defined bits for control and status purposes. The low-order 16 bits of CR0 are also known as the Machine Status Word, MSW, for compatibility with M80286 Protected Mode. LMSW and SMSW instructions are taken as special aliases of the load and store CR0 operations, where only the low-order 16 bits of CR0 are involved. For compatibility with M80286 operating systems, the Military Intel386 processor's LMSW instruction works in an identical fashion to the LMSW instruction on the M80286 (i.e., it only operates on the low-order 16 bits of CR0 and it ignores the new bits in CR0). New Military Intel386 processor operating systems should use the MOV CR0, Reg instruction.

The defined CR0 bits are described below.

PG (Paging Enable, bit 31)
  The PG bit is set to enable the on-chip paging unit. It is reset to disable the on-chip paging unit.

ET (Processor Extension Type, bit 4)
  ET indicates the processor extension type (either M80287 or M387 coprocessor) as detected by the level of the ERROR# input following M80386 reset. The ET bit may also be set or reset by loading CR0 under program control if desired. If ET is set, the M387 NPX's compatible 32-bit protocol is used. If ET is reset, M80287-compatible 16-bit protocol is used.
  Note that for strict M80286 compatibility, ET is not affected by the LMSW instruction. When the MSW or CR0 is stored, bit 4 accurately reflects the current state of the ET bit.

Figure 2-5. Control Register 0

  Bit(s)    Field
  31        PG
  30-5      0 (Intel reserved)
  4         ET
  3         TS
  2         EM
  1         MP
  0         PE

  Bits 15-0 of CR0 form the MSW.

  NOTE: 0 indicates Intel reserved: do not define; see section 1.3.10.
TS (Task Switched, bit 3)
  TS is automatically set whenever a task switch operation is performed. If TS is set, a coprocessor ESCape opcode will cause a Coprocessor Not Available trap (exception 7). The trap handler typically saves the M80287/M387 NPX context belonging to a previous task, loads the M80287/M387 NPX state belonging to the current task, and clears the TS bit before returning to the faulting coprocessor opcode.

EM (Emulate Coprocessor, bit 2)
  The EMulate coprocessor bit is set to cause all coprocessor opcodes to generate a Coprocessor Not Available fault (exception 7). It is reset to allow coprocessor opcodes to be executed on an actual M80287 or M387 coprocessor (this is the default case after reset). Note that the WAIT opcode is not affected by the EM bit setting.

MP (Monitor Coprocessor, bit 1)
  The MP bit is used in conjunction with the TS bit to determine if the WAIT opcode will generate a Coprocessor Not Available fault (exception 7) when TS = 1. When both MP = 1 and TS = 1, the WAIT opcode generates a trap. Otherwise, the WAIT opcode does not generate a trap. Note that TS is automatically set whenever a task switch operation is performed.

PE (Protection Enable, bit 0)
  The PE bit is set to enable Protected Mode. If PE is reset, the processor operates again in Real Mode. PE may be set by loading MSW or CR0. PE can be reset only by a load into CR0. Resetting the PE bit is typically part of a longer instruction sequence needed for proper transition from Protected Mode to Real Mode. Note that for strict M80286 compatibility, PE cannot be reset by the LMSW instruction.

CR1: Reserved

CR1 is reserved for use in future Intel processors.

CR2: Page Fault Linear Address

CR2, shown in Figure 2-6, holds the 32-bit linear address that caused the last page fault detected. The error code pushed onto the page fault handler's stack when it is invoked provides additional status information on this page fault.
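The interaction of the CR0 bits described above can be summarized in a short Python sketch. The bit positions come from the CR0 bit descriptions; the function names are hypothetical. The `lmsw` model assumes that LMSW writes only the PE, MP, EM and TS bits, which is a simplification consistent with, but narrower than, the statement that LMSW operates on the low-order 16 bits while leaving ET and the new bits alone.

```python
CR0_PE = 1 << 0
CR0_MP = 1 << 1
CR0_EM = 1 << 2
CR0_TS = 1 << 3
CR0_ET = 1 << 4
CR0_PG = 1 << 31

_LMSW_BITS = CR0_PE | CR0_MP | CR0_EM | CR0_TS  # assumed writable set


def msw(cr0):
    """The Machine Status Word is the low-order 16 bits of CR0."""
    return cr0 & 0xFFFF


def esc_traps(cr0):
    """A coprocessor ESCape opcode raises exception 7 if EM or TS is set."""
    return bool(cr0 & (CR0_EM | CR0_TS))


def wait_traps(cr0):
    """WAIT raises exception 7 only when both MP = 1 and TS = 1."""
    return (cr0 & (CR0_MP | CR0_TS)) == (CR0_MP | CR0_TS)


def lmsw(cr0, value):
    """Model of LMSW: it cannot reset PE and does not touch ET or PG."""
    new_low = value & _LMSW_BITS
    new_low |= cr0 & CR0_PE          # PE, once set, stays set under LMSW
    return (cr0 & ~_LMSW_BITS & 0xFFFFFFFF) | new_low
```

For instance, with only TS set an ESCape opcode traps but WAIT does not; setting MP as well makes WAIT trap too, which is what lets an operating system defer saving coprocessor state until the new task actually uses the coprocessor.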
CR3: Page Directory Base Address

CR3, shown in Figure 2-6, contains the physical address of the page directory table. The Military Intel386 processor page directory table is always page-aligned (4 Kbyte-aligned). Therefore the lowest twelve bits of CR3 are ignored when written and are undefined when stored.

A task switch through a TSS which changes the value in CR3, or an explicit load into CR3 with any value, will invalidate all cached page table entries in the paging unit cache. Note that if the value in CR3 does not change during the task switch, the cached page table entries are not flushed.

1.3.7 System Address Registers

Four special registers are defined to reference the tables or segments supported by the M80286/Military Intel386 microprocessor protection model. These tables or segments are:

  GDT (Global Descriptor Table),
  IDT (Interrupt Descriptor Table),
  LDT (Local Descriptor Table),
  TSS (Task State Segment).

The addresses of these tables and segments are stored in special registers, the System Address and System Segment Registers illustrated in Figure 2-7. These registers are named GDTR, IDTR, LDTR and TR, respectively. Section 3, Protected Mode Architecture, describes the use of these registers.

GDTR and IDTR

These registers hold the 32-bit linear base address and 16-bit limit of the GDT and IDT, respectively. The GDT and IDT segments, since they are global to all tasks in the system, are defined by 32-bit linear addresses (subject to page translation if paging is enabled) and 16-bit limit values.

Figure 2-6. Control Registers 2 and 3

  Register   Bits     Contents
  CR2        31-0     Page Fault Linear Address Register
  CR3        31-12    Page Directory Base Register
  CR3        11-0     0 (Intel reserved)

  NOTE: 0 indicates Intel reserved: do not define; see section 1.3.10.
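Two of the layouts just described lend themselves to a small Python sketch: the page-aligned directory base held in CR3, and the 48-bit GDTR/IDTR contents (a 32-bit linear base in bits 47-16 and a 16-bit limit in bits 15-0, as drawn in Figure 2-7). The function names are hypothetical, and representing a system address register as a single 48-bit integer is a modelling choice made here for illustration.

```python
def page_directory_base(cr3):
    """The lowest twelve bits of CR3 are ignored: the base is 4 Kbyte-aligned."""
    return cr3 & 0xFFFFF000


def pack_table_register(base, limit):
    """Combine a 32-bit linear base and 16-bit limit into a 48-bit value."""
    return ((base & 0xFFFFFFFF) << 16) | (limit & 0xFFFF)


def unpack_table_register(value):
    """Split a 48-bit GDTR/IDTR image into (base, limit)."""
    return (value >> 16) & 0xFFFFFFFF, value & 0xFFFF
```

As a usage example, a table at linear address 00100000H with a limit of 07FFH packs to 0010000007FFH, and unpacking that value returns the original pair.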
Figure 2-7. System Address and System Segment Registers

  System Address Registers
  Register       Bits 47-16                       Bits 15-0
  GDTR, IDTR     32-bit linear base address       Limit

  System Segment Registers      Descriptor registers (automatically loaded)
  (bits 15-0)                   32-bit linear base address | 32-bit segment limit | Attributes
  TR    selector
  LDTR  selector

LDTR and TR

These registers hold the 16-bit selector for the LDT descriptor and the TSS descriptor, respectively.

The LDT and TSS segments, since they are task-specific segments, are defined by selector values stored in the system segment registers. Note that a segment descriptor register (programmer-invisible) is associated with each system segment register.

1.3.8 Debug and Test Registers

Debug Registers: The six programmer accessible debug registers provide on-chip support for debugging. Debug Registers DR0-3 specify the four linear breakpoints. The Debug Control Register DR7 is used to set the breakpoints and the Debug Status Register DR6 displays the current state of the breakpoints. The use of the debug registers is described in section 1.12, Debugging Support.

Figure 2-8. Debug and Test Registers

  Debug Registers (bits 31-0)
  DR0    Linear Breakpoint Address 0
  DR1    Linear Breakpoint Address 1
  DR2    Linear Breakpoint Address 2
  DR3    Linear Breakpoint Address 3
  DR4    Intel reserved. Do not define.
  DR5    Intel reserved. Do not define.
  DR6    Breakpoint Status
  DR7    Breakpoint Control

  Test Registers (for page cache) (bits 31-0)
  TR6    Test Control
  TR7    Test Status

Test Registers: Two registers are used to control the testing of the RAM/CAM (Content Addressable Memories) in the Translation Lookaside Buffer portion of the Military Intel386 microprocessor. TR6 is the command test register, and TR7 is the data register which contains the data of the Translation Lookaside Buffer test. Their use is discussed in section 1.11, Testability.

Figure 2-8 shows the Debug and Test registers.
1.3.9 register accessibility

there are a few differences regarding the accessibility of the registers in real and protected mode. table 2-1 summarizes these differences. see section 3 protected mode architecture for further details.

table 2-1. register usage

                     use in          use in              use in
                     real mode       protected mode      virtual m8086 mode
register             load   store    load     store      load    store
general registers    yes    yes      yes      yes        yes     yes
segment registers    yes    yes      yes      yes        yes     yes
flag register        yes    yes      yes      yes        iopl    iopl *
control registers    yes    yes      pl = 0   pl = 0     no      yes
gdtr                 yes    yes      pl = 0   yes        no      yes
idtr                 yes    yes      pl = 0   yes        no      yes
ldtr                 no     no       pl = 0   yes        no      no
tr                   no     no       pl = 0   yes        no      no
debug control        yes    yes      pl = 0   pl = 0     no      no
test registers       yes    yes      pl = 0   pl = 0     no      no

notes:
pl = 0: the registers can be accessed only when the current privilege level is zero.
* iopl: the pushf and popf instructions are made i/o privilege level sensitive in virtual m8086 mode.

1.3.10 compatibility

very important note: compatibility with future processors

in the preceding register descriptions, note certain military intel386 processor register bits are undefined. when undefined bits are called out, treat them as fully undefined. this is essential for your software compatibility with future processors! follow the guidelines below:

1) do not depend on the states of any undefined bits when testing the values of defined register bits. mask them out when testing.
2) do not depend on the states of any undefined bits when storing them to memory or another register.
3) do not depend on the ability to retain information written into any undefined bits.
4) when loading registers always load the undefined bits as zeros.
5) however, registers which have been previously stored may be reloaded without masking.

depending upon the values of undefined register bits will make your software dependent upon the unspecified military intel386 handling of these bits. depending on undefined values risks making your software incompatible with future processors that define usages for the military intel386 microprocessor's undefined bits. avoid any software dependence upon the state of undefined military intel386 microprocessor register bits.

1.4 instruction set

1.4.1 instruction set overview

the instruction set is divided into nine categories of operations:

  data transfer
  arithmetic
  shift/rotate
  string manipulation
  bit manipulation
  control transfer
  high level language support
  operating system support
  processor control

these military intel386 processor instructions are listed in table 2-2.

all military intel386 processor instructions operate on either 0, 1, 2, or 3 operands, where an operand resides in a register, in the instruction itself, or in memory. most zero operand instructions (e.g. cli, sti) take only one byte. one operand instructions generally are two bytes long. the average instruction is 3.2 bytes long. since the military intel386 processor has a 16-byte instruction queue, an average of 5 instructions will be prefetched. the use of two operands permits the following types of common instructions:

  register to register
  memory to register
  immediate to register
  register to memory
  immediate to memory.

the operands can be either 8, 16, or 32 bits long. as a general rule, when executing code written for the military intel386 processor (32-bit code), operands are 8 or 32 bits; when executing existing m80286 or m8086 code (16-bit code), operands are 8 or 16 bits. prefixes can be added to all instructions which override the default length of the operands (i.e. use 32-bit operands for 16-bit code, or 16-bit operands for 32-bit code).
1.4.2 military intel386 tm microprocessor instructions

table 2-2a. data transfer

general purpose
  mov            move operand
  push           push operand onto stack
  pop            pop operand off stack
  pusha          push all registers on stack
  popa           pop all registers off stack
  xchg           exchange operand, register
  xlat           translate
conversion
  movzx          move byte or word, dword, with zero extension
  movsx          move byte or word, dword, sign extended
  cbw            convert byte to word, or word to dword
  cwd            convert word to dword
  cwde           convert word to dword extended
  cdq            convert dword to qword
input/output
  in             input operand from i/o space
  out            output operand to i/o space
address object
  lea            load effective address
  lds            load pointer into d segment register
  les            load pointer into e segment register
  lfs            load pointer into f segment register
  lgs            load pointer into g segment register
  lss            load pointer into s (stack) segment register
flag manipulation
  lahf           load a register from flags
  sahf           store a register in flags
  pushf          push flags onto stack
  popf           pop flags off stack
  pushfd         push eflags onto stack
  popfd          pop eflags off stack
  clc            clear carry flag
  cld            clear direction flag
  cmc            complement carry flag
  stc            set carry flag
  std            set direction flag

table 2-2b. arithmetic instructions

addition
  add            add operands
  adc            add with carry
  inc            increment operand by 1
  aaa            ascii adjust for addition
  daa            decimal adjust for addition
subtraction
  sub            subtract operands
  sbb            subtract with borrow
  dec            decrement operand by 1
  neg            negate operand
  cmp            compare operands
  das            decimal adjust for subtraction
  aas            ascii adjust for subtraction
multiplication
  mul            multiply double/single precision
  imul           integer multiply
  aam            ascii adjust after multiply
division
  div            divide unsigned
  idiv           integer divide
  aad            ascii adjust before division

table 2-2c. string instructions

  movs           move byte or word, dword string
  ins            input string from i/o space
  outs           output string to i/o space
  cmps           compare byte or word, dword string
  scas           scan byte or word, dword string
  lods           load byte or word, dword string
  stos           store byte or word, dword string
  rep            repeat
  repe/repz      repeat while equal/zero
  repne/repnz    repeat while not equal/not zero

table 2-2d. logical instructions

logicals
  not            "not" operands
  and            "and" operands
  or             "inclusive or" operands
  xor            "exclusive or" operands
  test           "test" operands
shifts
  shl/shr        shift logical left or right
  sal/sar        shift arithmetic left or right
  shld/shrd      double shift left or right
rotates
  rol/ror        rotate left/right
  rcl/rcr        rotate through carry left/right

table 2-2e. bit manipulation instructions

single bit instructions
  bt             bit test
  bts            bit test and set
  btr            bit test and reset
  btc            bit test and complement
  bsf            bit scan forward
  bsr            bit scan reverse

table 2-2f. program control instructions

conditional transfers
  setcc          set byte equal to condition code
  ja/jnbe        jump if above/not below nor equal
  jae/jnb        jump if above or equal/not below
  jb/jnae        jump if below/not above nor equal
  jbe/jna        jump if below or equal/not above
  jc             jump if carry
  je/jz          jump if equal/zero
  jg/jnle        jump if greater/not less nor equal
  jge/jnl        jump if greater or equal/not less
  jl/jnge        jump if less/not greater nor equal
  jle/jng        jump if less or equal/not greater
  jnc            jump if not carry
  jne/jnz        jump if not equal/not zero
  jno            jump if not overflow
  jnp/jpo        jump if not parity/parity odd
  jns            jump if not sign
  jo             jump if overflow
  jp/jpe         jump if parity/parity even
  js             jump if sign
unconditional transfers
  call           call procedure/task
  ret            return from procedure
  jmp            jump
iteration controls
  loop           loop
  loope/loopz    loop if equal/zero
  loopne/loopnz  loop if not equal/not zero
  jcxz           jump if register cx = 0
interrupts
  int            interrupt
  into           interrupt if overflow
  iret           return from interrupt/task
  cli            clear interrupt enable
  sti            set interrupt enable

table 2-2g. high level language instructions

  bound          check array bounds
  enter          setup parameter block for entering procedure
  leave          leave procedure

table 2-2h. protection model

  sgdt           store global descriptor table
  sidt           store interrupt descriptor table
  str            store task register
  sldt           store local descriptor table
  lgdt           load global descriptor table
  lidt           load interrupt descriptor table
  ltr            load task register
  lldt           load local descriptor table
  arpl           adjust requested privilege level
  lar            load access rights
  lsl            load segment limit
  verr/verw      verify segment for reading or writing
  lmsw           load machine status word (lower 16 bits of cr0)
  smsw           store machine status word

table 2-2i. processor control instructions

  hlt            halt
  wait           wait until busy negated
  esc            escape
  lock           lock bus
1.5 addressing modes

1.5.1 addressing modes overview

the military intel386 microprocessor provides a total of 11 addressing modes for instructions to specify operands. the addressing modes are optimized to allow the efficient execution of high level languages such as c and fortran, and they cover the vast majority of data references needed by high-level languages.

1.5.2 register and immediate modes

two of the addressing modes provide for instructions that operate on register or immediate operands:

register operand mode: the operand is located in one of the 8-, 16- or 32-bit general registers.

immediate operand mode: the operand is included in the instruction as part of the opcode.

1.5.3 32-bit memory addressing modes

the remaining 9 modes provide a mechanism for specifying the effective address of an operand. the linear address consists of two components: the segment base address and an effective address. the effective address is calculated by using combinations of the following four address elements:

displacement: an 8- or 32-bit immediate value, following the instruction.

base: the contents of any general purpose register. the base registers are generally used by compilers to point to the start of the local variable area.

index: the contents of any general purpose register except for esp. the index registers are used to access the elements of an array, or a string of characters.

scale: the index register's value can be multiplied by a scale factor, either 1, 2, 4 or 8. scaled index mode is especially useful for accessing arrays or structures.

combinations of these 4 components make up the 9 additional addressing modes. there is no performance penalty for using any of these addressing combinations, since the effective address calculation is pipelined with the execution of other instructions. the one exception is the simultaneous use of base and index components which requires one additional clock.

as shown in figure 2-9, the effective address (ea) of an operand is calculated according to the following formula.

  ea = base reg + (index reg * scaling) + displacement

direct mode: the operand's offset is contained as part of the instruction as an 8-, 16- or 32-bit displacement.
  example: inc word ptr [500]

register indirect mode: a base register contains the address of the operand.
  example: mov [ecx], edx

based mode: a base register's contents is added to a displacement to form the operand's offset.
  example: mov ecx, [eax + 24]

index mode: an index register's contents is added to a displacement to form the operand's offset.
  example: add eax, table[esi]

scaled index mode: an index register's contents is multiplied by a scaling factor which is added to a displacement to form the operand's offset.
  example: imul ebx, table[esi * 4], 7

based index mode: the contents of a base register is added to the contents of an index register to form the effective address of an operand.
  example: mov eax, [esi][ebx]

based scaled index mode: the contents of an index register is multiplied by a scaling factor and the result is added to the contents of a base register to obtain the operand's offset.
  example: mov ecx, [edx * 8][eax]

based index mode with displacement: the contents of an index register and a base register's contents and a displacement are all summed together to form the operand offset.
  example: add edx, [esi][ebp + 00fffff0h]

based scaled index mode with displacement: the contents of an index register are multiplied by a scaling factor, the result is added to the contents of a base register and a displacement to form the operand's offset.
  example: mov eax, localtable[edi * 4][ebp + 80]
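the effective address formula above can be checked with a short model. this is a minimal sketch: the function name and keyword arguments are hypothetical, register contents are passed in as plain integers, and wrapping the result to 32 bits is an assumption about 32-bit address arithmetic rather than something stated in this section.

```python
VALID_SCALES = (1, 2, 4, 8)


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Compute ea = base + (index * scale) + displacement, truncated to 32 bits.

    Omitted components default to zero, so the same function covers all nine
    memory addressing modes (direct, based, scaled index, and so on).
    """
    if scale not in VALID_SCALES:
        raise ValueError("scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# based mode, as in: mov ecx, [eax + 24] with eax = 1000h
based = effective_address(base=0x1000, displacement=24)

# based scaled index mode with displacement, as in:
# mov eax, localtable[edi * 4][ebp + 80]
# (ebp = 2000h and edi = 3 are made-up values; localtable is taken as offset 0)
scaled = effective_address(base=0x2000, index=3, scale=4, displacement=80)
```

leaving a component at its default models the simpler modes; for instance direct mode is `effective_address(displacement=500)`.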
figure 2-9. addressing mode calculations

1.5.4 differences between 16- and 32-bit addresses

in order to provide software compatibility with the m80286 and the m8086, the military intel386 microprocessor can execute 16-bit instructions in real and protected modes. the processor determines the size of the instructions it is executing by examining the d bit in the cs segment descriptor. if the d bit is 0 then all operand lengths and effective addresses are assumed to be 16 bits long. if the d bit is 1 then the default length for operands and addresses is 32 bits. in real mode the default size for operands and addresses is 16 bits.

regardless of the default precision of the operands or addresses, the military intel386 microprocessor is able to execute either 16- or 32-bit instructions. this is specified via the use of override prefixes. two prefixes, the operand size prefix and the address length prefix, override the value of the d bit on an individual instruction basis. these prefixes are automatically added by intel assemblers.

example: the processor is executing in real mode and the programmer needs to access the eax register. the assembler code for this might be mov eax, 32bitmemoryop. asm386 automatically determines that an operand size prefix is needed and generates it.

example: the d bit is 0, and the programmer wishes to use scaled index addressing mode to access an array. the address length prefix allows the use of mov dx, table[esi * 2]. the assembler uses an address length prefix since, with d = 0, the default addressing mode is 16 bits.

example: the d bit is 1, and the program wants to store a 16-bit quantity. the operand length prefix is used to specify only a 16-bit value: mov mem16, dx.
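the interaction between the d bit and the two override prefixes can be summarized as a small decision function. this is a minimal sketch with hypothetical names; it models only the size selection described above, not instruction decoding.

```python
def effective_sizes(d_bit, operand_size_prefix=False, address_size_prefix=False):
    """Return (operand_bits, address_bits) for one instruction.

    d_bit is the D bit of the CS segment descriptor (0 or 1). Real mode
    behaves like d_bit = 0, since its default size is 16 bits. Each prefix
    switches its own attribute to the non-default size for that instruction.
    """
    default = 32 if d_bit else 16
    alternate = 16 if d_bit else 32
    operand_bits = alternate if operand_size_prefix else default
    address_bits = alternate if address_size_prefix else default
    return operand_bits, address_bits


# the three examples from the text:
real_mode_eax = effective_sizes(0, operand_size_prefix=True)       # mov eax, 32bitmemoryop
scaled_index_16 = effective_sizes(0, address_size_prefix=True)     # mov dx, table[esi * 2]
store_16_in_32 = effective_sizes(1, operand_size_prefix=True)      # mov mem16, dx
```

the prefixes are independent, so `effective_sizes(0, True, True)` selects 32-bit operands and 32-bit addressing for a single instruction inside 16-bit code.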
table 2-3. base and index registers for 16- and 32-bit addresses

                  16-bit addressing    32-bit addressing
base register     bx, bp               any 32-bit gp register
index register    si, di               any 32-bit gp register except esp
scale factor      none                 1, 2, 4, 8
displacement      0, 8, 16 bits        0, 8, 32 bits

the operand length and address length prefixes can be applied separately or in combination to any instruction. the address length prefix does not allow addresses over 64 kbytes to be accessed in real mode. a memory address which exceeds ffffh will result in a general protection fault. an address length prefix only allows the use of the additional military intel386 microprocessor addressing modes.

when executing 32-bit code, the military intel386 microprocessor uses either 8- or 32-bit displacements, and any register can be used as base or index registers. when executing 16-bit code, the displacements are either 8 or 16 bits, and the base and index registers conform to the 286 model. table 2-3 illustrates the differences.

1.6 data types

the military intel386 microprocessor supports all of the data types commonly used in high level languages:

bit: a single bit quantity.

bit field: a group of up to 32 contiguous bits, which spans a maximum of four bytes.

bit string: a set of contiguous bits; on the military intel386 microprocessor bit strings can be up to 4 gigabits long.

byte: a signed 8-bit quantity.

unsigned byte: an unsigned 8-bit quantity.

integer (word): a signed 16-bit quantity.

long integer (double word): a signed 32-bit quantity. all operations assume a 2's complement representation.

unsigned integer (word): an unsigned 16-bit quantity.

unsigned long integer (double word): an unsigned 32-bit quantity.

signed quad word: a signed 64-bit quantity.

unsigned quad word: an unsigned 64-bit quantity.

offset: a 16- or 32-bit offset only quantity which indirectly references another memory location.

pointer: a full pointer which consists of a 16-bit segment selector and either a 16- or 32-bit offset.

char: a byte representation of an ascii alphanumeric or control character.

string: a contiguous sequence of bytes, words or dwords. a string may contain between 1 byte and 4 gbytes.

bcd: a byte (unpacked) representation of decimal digits 0-9.

packed bcd: a byte (packed) representation of two decimal digits 0-9 storing one digit in each nibble.

when the military intel386 microprocessor is coupled with a numerics coprocessor such as the m80287 or the military i387 coprocessor then the following common floating point types are supported.

floating point: a signed 32-, 64-, or 80-bit real number representation. floating point numbers are supported by the m80287 and military i387 numerics coprocessor.

figure 2-10 illustrates the data types supported by the military intel386 processor and the military i387 coprocessor.
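the difference between the unpacked and packed bcd types listed above is easiest to see on a concrete value. this is a minimal sketch with hypothetical helper names; placing the more significant digit in the high nibble (and first in the unpacked sequence) is an assumed convention, since the text only says one digit is stored per nibble.

```python
def to_unpacked_bcd(value):
    """Encode a non-negative integer as unpacked BCD: one decimal digit per byte.

    Assumption: most significant digit first.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    return bytes(int(digit) for digit in str(value))


def to_packed_bcd(value):
    """Encode 0..99 as one packed BCD byte: one decimal digit in each nibble.

    Assumption: tens digit in the high nibble, ones digit in the low nibble.
    """
    if not 0 <= value <= 99:
        raise ValueError("a packed bcd byte holds two digits (0..99)")
    tens, ones = divmod(value, 10)
    return (tens << 4) | ones


def from_packed_bcd(byte):
    """Decode a packed BCD byte back to an integer."""
    high, low = byte >> 4, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError("nibble is not a decimal digit")
    return high * 10 + low
```

with these helpers, decimal 47 becomes the two bytes 04h 07h in unpacked form and the single byte 47h in packed form.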
figure 2-10. military intel386 tm microprocessor supported data types
1.7 memory organization

1.7.1 introduction

memory on the military intel386 microprocessor is divided up into 8-bit quantities (bytes), 16-bit quantities (words), and 32-bit quantities (dwords). words are stored in two consecutive bytes in memory with the low-order byte at the lowest address, the high-order byte at the high address. dwords are stored in four consecutive bytes in memory with the low-order byte at the lowest address, the high-order byte at the highest address. the address of a word or dword is the byte address of the low-order byte.

in addition to these basic data types the military intel386 microprocessor supports two larger units of memory: pages and segments. memory can be divided up into one or more variable length segments, which can be swapped to disk or shared between programs. memory can also be organized into one or more 4 kbyte pages. finally, both segmentation and paging can be combined, gaining the advantages of both systems. the military intel386 microprocessor supports both pages and segments in order to provide maximum flexibility to the system designer. segmentation and paging are complementary. segmentation is useful for organizing memory in logical modules, and as such is a tool for the application programmer, while pages are useful for the system programmer for managing the physical memory of a system.

1.7.2 address spaces

the military intel386 microprocessor has three distinct address spaces: logical, linear, and physical. a logical address (also known as a virtual address) consists of a selector and an offset. a selector is the contents of a segment register. an offset is formed by summing all of the addressing components (base, index, displacement) discussed in section 1.5.3 memory addressing modes into an effective address. since each task on the military intel386 microprocessor has a maximum of 16k (2^14 - 1) selectors, and offsets can be 4 gigabytes (2^32 bytes), this gives a total of 2^46 bytes or 64 terabytes of logical address space per task. the programmer sees this virtual address space.

the segmentation unit translates the logical address space into a 32-bit linear address space. if the paging unit is not enabled then the 32-bit linear address corresponds to the physical address. the paging unit translates the linear address space into the physical address space. the physical address is what appears on the address pins.

the primary difference between real mode and protected mode is how the segmentation unit performs the translation of the logical address into the linear address. in real mode, the segmentation unit shifts the selector left four bits and adds the result to the offset to form the linear address. in protected mode, every selector has a linear base address associated with it. the linear base address is stored in one of two operating system tables (i.e. the local descriptor table or global descriptor table). the selector's linear base address is added to the offset to form the final linear address.

figure 2-11 shows the relationship between the various address spaces.

figure 2-11. address translation
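the two logical-to-linear translations described above, and the size of the logical address space, can be worked through numerically. this is a minimal sketch with hypothetical function names; in the protected mode helper the segment base is passed in directly, standing in for the descriptor table lookup, and wrapping to 32 bits is an assumption.

```python
def real_mode_linear(selector, offset):
    """Real mode: shift the 16-bit selector left four bits and add the offset."""
    if not 0 <= selector <= 0xFFFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("selector and offset are 16-bit values in this model")
    return (selector << 4) + offset


def protected_mode_linear(segment_base, offset):
    """Protected mode: add the offset to the segment's linear base address.

    segment_base stands in for the value the processor would read from the
    local or global descriptor table for the selector in use.
    """
    return (segment_base + offset) & 0xFFFFFFFF


# logical address space per task: 2^14 selectors times 2^32 bytes per segment
SELECTORS_PER_TASK = 2 ** 14
BYTES_PER_SEGMENT = 2 ** 32
LOGICAL_SPACE_BYTES = SELECTORS_PER_TASK * BYTES_PER_SEGMENT
TERABYTE = 2 ** 40
```

for example, the real mode address 1234h:0010h maps to linear address 12350h, and `LOGICAL_SPACE_BYTES // TERABYTE` gives the 64 terabytes quoted in the text.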
1.7.3 segment register usage

the main data structure used to organize memory is the segment. on the military intel386 microprocessor, segments are variable sized blocks of linear addresses which have certain attributes associated with them. there are two main types of segments: code and data. the segments are of variable size and can be as small as 1 byte or as large as 4 gigabytes (2^32 bytes).

in order to provide compact instruction encoding, and increase processor performance, instructions do not need to explicitly specify which segment register is used. a default segment register is automatically chosen according to the rules of table 2-4 (segment register selection rules). in general, data references use the selector contained in the ds register; stack references use the ss register and instruction fetches use the cs register. the contents of the instruction pointer provides the offset. special segment override prefixes allow the explicit use of a given segment register, and override the implicit rules listed in table 2-4. the override prefixes also allow the use of the es, fs and gs segment registers.

there are no restrictions regarding the overlapping of the base addresses of any segments. thus, all 6 segments could have the base address set to zero and create a system with a four gigabyte linear address space. this creates a system where the virtual address space is the same as the linear address space. further details of segmentation are discussed in section 3.1.

1.8 i/o space

the military intel386 microprocessor has two distinct physical address spaces: memory and i/o. generally, peripherals are placed in i/o space although the military intel386 processor also supports memory-mapped peripherals. the i/o space consists of 64 kbytes; it can be divided into 64k 8-bit ports, 32k 16-bit ports, or 16k 32-bit ports, or any combination of ports which add up to less than 64 kbytes.
the 64k i/o address space refers to physical memory rather than linear address since i/o instructions do not go through the segmentation or paging hardware. the m/io pin acts as an additional address line thus allowing the system designer to easily determine which address space the processor is accessing.

table 2-4. segment register selection rules

type of memory reference                      implied (default)   segment override
                                              segment use         prefixes possible
code fetch                                    cs                  none
destination of push, pushf, int, call,        ss                  none
  pusha instructions
source of pop, popa, popf, iret, ret          ss                  none
  instructions
destination of stos, movs, rep stos,          es                  none
  rep movs instructions
  (di is base register)
other data references, with effective
address using base register of:
  [eax]                                       ds                  ds, cs, ss, es, fs, gs
  [ebx]                                       ds                  ds, cs, ss, es, fs, gs
  [ecx]                                       ds                  ds, cs, ss, es, fs, gs
  [edx]                                       ds                  ds, cs, ss, es, fs, gs
  [esi]                                       ds                  ds, cs, ss, es, fs, gs
  [edi] *                                     ds                  ds, cs, ss, es, fs, gs
  [ebp]                                       ss                  ds, cs, ss, es, fs, gs
  [esp]                                       ss                  ds, cs, ss, es, fs, gs

* data references for the memory destination of the stos and movs instructions (and rep stos and rep movs) use di as the base register and es as the segment, with no override possible.
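the data-reference rows of table 2-4 reduce to a short rule, sketched below. the function and parameter names are hypothetical, and the model covers only data references through a base register plus the string-destination exception from the table footnote.

```python
SEGMENT_REGISTERS = {"cs", "ds", "es", "fs", "gs", "ss"}
STACK_BASE_REGISTERS = {"ebp", "esp"}  # these default to ss in table 2-4
DATA_BASE_REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi"}


def segment_for(base_register, override=None, string_destination=False):
    """Pick the segment register used for a data reference.

    string_destination marks the memory destination of stos/movs (and their
    rep forms), which always uses es and ignores any override prefix.
    """
    if base_register not in STACK_BASE_REGISTERS | DATA_BASE_REGISTERS:
        raise ValueError("unknown base register: %r" % base_register)
    if string_destination:
        return "es"
    if override is not None:
        if override not in SEGMENT_REGISTERS:
            raise ValueError("unknown segment register: %r" % override)
        return override
    return "ss" if base_register in STACK_BASE_REGISTERS else "ds"
```

so a reference through `[ebp]` uses ss unless a prefix says otherwise, while the destination of a string move stays in es even when an override is supplied.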
the i/o ports are accessed via the in and out i/o instructions, with the port address supplied as an immediate 8-bit constant in the instruction or in the dx register. all 8- and 16-bit port addresses are zero extended on the upper address lines. the i/o instructions cause the m/io pin to be driven low.

i/o port addresses 00f8h through 00ffh are reserved for use by intel.

1.9 interrupts

1.9.1 interrupts and exceptions

interrupts and exceptions alter the normal program flow, in order to handle external events, to report errors or exceptional conditions. the difference between interrupts and exceptions is that interrupts are used to handle asynchronous external events while exceptions handle instruction faults. although a program can generate a software interrupt via an int n instruction, the processor treats software interrupts as exceptions.

hardware interrupts occur as the result of an external event and are classified into two types: maskable or non-maskable. interrupts are serviced after the execution of the current instruction. after the interrupt handler is finished servicing the interrupt, execution proceeds with the instruction immediately after the interrupted instruction. sections 1.9.3 and 1.9.4 discuss the differences between maskable and non-maskable interrupts.

exceptions are classified as faults, traps, or aborts depending on the way they are reported, and whether or not restart of the instruction causing the exception is supported. faults are exceptions that are detected and serviced before the execution of the faulting instruction. a fault would occur in a virtual memory system, when the processor referenced a page or a segment which was not present. the operating system would fetch the page or segment from disk, and then the military intel386 microprocessor would restart the instruction. traps are exceptions that are reported immediately after the execution of the instruction which caused the problem. user defined interrupts are examples of traps. aborts are exceptions which do not permit the precise location of the instruction causing the exception to be determined. aborts are used to report severe errors, such as a hardware error, or illegal values in system tables.

thus, when an interrupt service routine has been completed, execution proceeds from the instruction immediately following the interrupted instruction. on the other hand, the return address from an exception fault routine will always point at the instruction causing the exception and include any leading instruction prefixes. table 2-5 summarizes the possible interrupts for the military intel386 microprocessor and shows where the return address points.

the military intel386 microprocessor has the ability to handle up to 256 different interrupts/exceptions. in order to service the interrupts, a table with up to 256 interrupt vectors must be defined. the interrupt vectors are simply pointers to the appropriate interrupt service routine. in real mode (see section 2.1), the vectors are 4 byte quantities, a code segment plus a 16-bit offset; in protected mode, the interrupt vectors are 8 byte quantities, which are put in an interrupt descriptor table (see section 3.1). of the 256 possible interrupts, 32 are reserved for use by intel, the remaining 224 are free to be used by the system designer.

1.9.2 interrupt processing

when an interrupt occurs the following actions happen. first, the current program address and the flags are saved on the stack to allow resumption of the interrupted program. next, an 8-bit vector is supplied to the military intel386 microprocessor which identifies the appropriate entry in the interrupt table. the table contains the starting address of the interrupt service routine. then, the user supplied interrupt service routine is executed.
finally, when an iret instruction is executed the old processor state is restored and program execution resumes at the appropriate instruction.

the 8-bit interrupt vector is supplied to the military intel386 microprocessor in several different ways: exceptions supply the interrupt vector internally; software int instructions contain or imply the vector; maskable hardware interrupts supply the 8-bit vector via the interrupt acknowledge bus sequence. non-maskable hardware interrupts are assigned to interrupt vector 2.

table 2-5. interrupt vector assignments

                                        instruction which              return address
                            interrupt   can cause                      points to faulting
function                    number      exception                      instruction          type
divide error                0           div, idiv                      yes                  fault
debug exception             1           any instruction                yes                  trap *
nmi interrupt               2           int 2 or nmi                   no                   nmi
one byte interrupt          3           int                            no                   trap
interrupt on overflow       4           into                           no                   trap
array bounds check          5           bound                          yes                  fault
invalid op-code             6           any illegal instruction        yes                  fault
device not available        7           esc, wait                      yes                  fault
double fault                8           any instruction that can                            abort
                                        generate an exception
invalid tss                 10          jmp, call, iret, int           yes                  fault
segment not present         11          segment register instructions  yes                  fault
stack fault                 12          stack references               yes                  fault
general protection fault    13          any memory reference           yes                  fault
page fault                  14          any memory access or           yes                  fault
                                        code fetch
coprocessor error           16          esc, wait                      yes                  fault
intel reserved              17-32
two byte interrupt          0-255       int n                          no                   trap

* some debug exceptions may report both traps on the previous instruction, and faults on the next instruction.

note: exception 9 no longer occurs on the m80386 due to the improved interface between the m80386 and its coprocessors.

1.9.3 maskable interrupt

maskable interrupts are the most common way used by the military intel386 microprocessor to respond to asynchronous external hardware events. a hardware interrupt occurs when the intr is pulled high and the interrupt flag bit (if) is enabled. the processor only responds to interrupts between instructions (repeat string instructions have an "interrupt window" between memory moves, which allows interrupts during long string moves). when an interrupt occurs the processor reads an 8-bit vector supplied by the hardware which identifies the source of the interrupt (one of 224 user defined interrupts). the exact nature of the interrupt sequence is discussed in section 4.

the if bit in the eflags register is reset when an interrupt is being serviced. this effectively disables servicing additional interrupts during an interrupt service routine. however, the if may be set explicitly by the interrupt handler, to allow the nesting of interrupts. when an iret instruction is executed the original state of the if is restored.
1.9.4 non-maskable interrupt

non-maskable interrupts provide a method of servicing very high priority interrupts. a common example of the use of a non-maskable interrupt (nmi) would be to activate a power failure routine. when the nmi input is pulled high it causes an interrupt with an internally supplied vector value of 2. unlike a normal hardware interrupt, no interrupt acknowledgment sequence is performed for an nmi.

while executing the nmi servicing procedure, the military intel386 microprocessor will not service further nmi requests, until an interrupt return (iret) instruction is executed or the processor is reset. if nmi occurs while currently servicing an nmi, its presence will be saved for servicing after executing the first iret instruction. the if bit is cleared at the beginning of an nmi interrupt to inhibit further intr interrupts.

1.9.5 software interrupts

a third type of interrupt/exception for the military intel386 microprocessor is the software interrupt. an int n instruction causes the processor to execute the interrupt service routine pointed to by the nth vector in the interrupt table.
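locating the nth vector follows directly from the entry sizes given in section 1.9.1 (4 bytes per vector in real mode, 8 bytes per entry in the protected mode interrupt descriptor table). this is a minimal sketch with hypothetical names; it computes byte offsets from the start of the table only and does not model the table's base address or the entry contents.

```python
REAL_MODE_ENTRY_BYTES = 4       # code segment plus 16-bit offset
PROTECTED_MODE_ENTRY_BYTES = 8  # interrupt descriptor table entry
VECTOR_COUNT = 256


def vector_entry_offset(n, protected_mode):
    """Return the byte offset of vector n from the start of the interrupt table."""
    if not 0 <= n < VECTOR_COUNT:
        raise ValueError("vector number must be in 0..255")
    entry_bytes = PROTECTED_MODE_ENTRY_BYTES if protected_mode else REAL_MODE_ENTRY_BYTES
    return n * entry_bytes


def table_size_bytes(protected_mode):
    """Size of a table that defines all 256 vectors."""
    return vector_entry_offset(VECTOR_COUNT - 1, protected_mode) + (
        PROTECTED_MODE_ENTRY_BYTES if protected_mode else REAL_MODE_ENTRY_BYTES
    )
```

for instance, int 3 in real mode uses the entry at byte offset 12, and a full protected mode table occupies 2048 bytes.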
a special case of the two byte software interrupt int n is the one byte int 3, or breakpoint interrupt. by inserting this one byte instruction in a program, the user can set breakpoints in his program as a debugging tool.

a final type of software interrupt is the single step interrupt. it is discussed in section 1.12.

1.9.6 interrupt and exception priorities

interrupts are externally-generated events. maskable interrupts (on the intr input) and non-maskable interrupts (on the nmi input) are recognized at instruction boundaries. when nmi and maskable intr are both recognized at the same instruction boundary, the military intel386 microprocessor invokes the nmi service routine first. if, after the nmi service routine has been invoked, maskable interrupts are still enabled, then the military intel386 microprocessor will invoke the appropriate interrupt service routine.

table 2-6a. military intel386 microprocessor priority for invoking service routines in case of simultaneous external interrupts

  1. nmi
  2. intr

exceptions are internally-generated events. exceptions are detected by the military intel386 microprocessor if, in the course of executing an instruction, the military intel386 microprocessor detects a problematic condition. the military intel386 microprocessor then immediately invokes the appropriate exception service routine. the state of the military intel386 microprocessor is such that the instruction causing the exception can be restarted. if the exception service routine has taken care of the problematic condition, the instruction will execute without causing the same exception.

it is possible for a single instruction to generate several exceptions (for example, transferring a single operand could generate two page faults if the operand location spans two "not present" pages). however, only one exception is generated upon each attempt to execute the instruction.
Each exception service routine should correct its corresponding exception and restart the instruction. In this manner, exceptions are serviced until the instruction executes successfully.

As the Military Intel386 microprocessor executes instructions, it follows a consistent cycle in checking for exceptions, as shown in Table 2-6b. This cycle is repeated as each instruction is executed, and occurs in parallel with instruction decoding and execution.

Table 2-6b. Sequence of Exception Checking

Consider the case of the Military Intel386 microprocessor having just completed an instruction. It then performs the following checks before reaching the point where the next instruction is completed:

  1. Check for exception 1 traps from the instruction just completed (single-step via trap flag, or data breakpoints set in the debug registers).
  2. Check for exception 1 faults in the next instruction (instruction execution breakpoint set in the debug registers for the next instruction).
  3. Check for external NMI and INTR.
  4. Check for segmentation faults that prevented fetching the entire next instruction (exceptions 11 or 13).
  5. Check for page faults that prevented fetching the entire next instruction (exception 14).
  6. Check for faults decoding the next instruction (exception 6 if illegal opcode; exception 6 if in real mode or in virtual M8086 mode and attempting to execute an instruction for protected mode only (see 3.6.4); or exception 13 if instruction is longer than 15 bytes, or privilege violation in protected mode (i.e. not at IOPL or at CPL = 0)).
  7. If WAIT opcode, check if TS = 1 and MP = 1 (exception 7 if both are 1).
  8. If ESCAPE opcode for numeric coprocessor, check if EM = 1 or TS = 1 (exception 7 if either is 1).
  9. If WAIT opcode or ESCAPE opcode for numeric coprocessor, check ERROR# input signal (exception 16 if ERROR# input is asserted).
  10. Check in the following order for each memory reference required by the instruction:
      a. Check for segmentation faults that prevent transferring the entire memory quantity (exceptions 11, 12, 13).
      b. Check for page faults that prevent transferring the entire memory quantity (exception 14).

Note that the order stated supports the concept of the paging mechanism being "underneath" the segmentation mechanism. Therefore, for any given code or data reference in memory, segmentation exceptions are generated before paging exceptions are generated.
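The restart behaviour described before Table 2-6b (one exception per attempt, the handler corrects the condition, the instruction is retried) can be illustrated with a small model. This is only a sketch under stated assumptions: the function name, the page numbers and the `present` set are hypothetical and do not correspond to any hardware interface.

```python
def run_with_restart(operand_pages, present):
    """Model one instruction that is restarted after each page fault.

    operand_pages: hypothetical page numbers touched by the operand.
    present: set of page numbers currently "present"; the modelled
             handler marks a faulting page present before the retry.
    Returns the list of exceptions serviced before the instruction completed.
    """
    serviced = []
    while True:
        # Only the first problem found is reported on each attempt.
        missing = next((p for p in operand_pages if p not in present), None)
        if missing is None:
            return serviced            # instruction executed successfully
        serviced.append(("page fault", 14, missing))
        present.add(missing)           # handler corrects the condition, then restart
```

With an operand spanning two "not present" pages, the model services two page faults, one per attempt, which matches the example given in the text.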
1.9.7 Instruction Restart

The Military Intel386 microprocessor fully supports restarting all instructions after faults. If an exception is detected in the instruction to be executed (exception categories 4 through 10 in Table 2-6b), the Military Intel386 microprocessor invokes the appropriate exception service routine. The Military Intel386 microprocessor is in a state that permits restart of the instruction, for all cases but those in Table 2-6c. Note that all such cases are easily avoided by proper design of the operating system.

Table 2-6c. Conditions Preventing Instruction Restart

  A. An instruction causes a task switch to a task whose Task State Segment (TSS) is partially "not present". (An entirely "not present" TSS is restartable.) Partially present TSSs can be avoided either by keeping the TSSs of such tasks present in memory, or by aligning TSS segments to reside entirely within a single 4K page (for TSS segments of 4 Kbytes or less).
  B. A coprocessor operand wraps around the top of a 64 Kbyte segment or a 4 Gbyte segment, and spans three pages, and the page holding the middle portion of the operand is "not present." This condition can be avoided by starting at a page boundary any segments containing coprocessor operands if the segments are approximately 64K-200 bytes or larger (i.e. large enough for wraparound of the coprocessor operand to possibly occur).

  Note that these conditions are avoided by using the operating system designs mentioned in this table.

1.9.8 Double Fault

A double fault (exception 8) results when the processor attempts to invoke an exception service routine for the segment exceptions (10, 11, 12 or 13), but in the process of doing so, detects an exception other than a page fault (exception 14).
A double fault (exception 8) will also be generated when the processor attempts to invoke the page fault (exception 14) service routine and detects an exception other than a second page fault. In any functional system, the entire page fault service routine must remain "present" in memory.

Double page faults, however, do not raise the double fault exception. If a second page fault occurs while the processor is attempting to enter the service routine for the first page fault, then the processor will invoke the page fault (exception 14) handler a second time rather than the double fault (exception 8) handler. A subsequent fault, though, will lead to shutdown.

When a double fault occurs, the Military Intel386 microprocessor invokes the exception service routine for exception 8.

1.10 Reset and Initialization

When the processor is initialized or reset the registers have the values shown in Table 2-7. The M80386 will then start executing instructions near the top of physical memory, at location FFFFFFF0H. When the first intersegment jump or call is executed, address lines A20-A31 will drop low for CS-relative memory cycles, and the Military Intel386 microprocessor will only execute instructions in the lower one megabyte of physical memory. This allows the system designer to use a ROM at the top of physical memory to initialize the system and take care of resets.

RESET forces the Military Intel386 processor to terminate all execution and local bus activity. No instruction execution or bus activity will occur as long as RESET is active. Between 350 and 450 CLK2 periods after RESET becomes inactive the Military Intel386 processor will start executing instructions at the top of physical memory.

Table 2-7. Register Values after Reset

  Register                     Value                       Note
  Flag word                    UUUU0002H                   1
  Machine status word (CR0)    UUUUUUU0H                   2
  Instruction pointer          0000FFF0H
  Code segment                 F000H                       3
  Data segment                 0000H
  Stack segment                0000H
  Extra segment (ES)           0000H
  Extra segment (FS)           0000H
  Extra segment (GS)           0000H
  DX register                  Component and stepping ID   5
  All other registers          Undefined                   4

Notes:
  1. EFLAGS register. The upper 14 bits of the EFLAGS register are undefined; VM (bit 17) and RF (bit 16) are 0, as are all other defined flag bits.
  2. CR0 (machine status word). All of the defined fields in CR0 are 0 (PG bit 31, TS bit 3, EM bit 2, MP bit 1, and PE bit 0) except for ET bit 4 (processor extension type). The ET bit is set during reset according to the type of coprocessor in the system. If the coprocessor is a Military i387 coprocessor then ET will be 1; if the coprocessor is an M80287 or no coprocessor is present then ET will be 0. All other bits are undefined.
  3. The code segment register (CS) will have its base address set to FFFF0000H and limit set to 0FFFFH.
  4. All undefined bits are Intel reserved and should not be used.
  5. The DX register always holds the component and stepping identifier (see 4.7). The EAX register holds the self-test signature if self-test was requested (see 4.6).
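The arithmetic behind the reset location can be checked directly from Table 2-7: the CS base of note 3 plus the instruction pointer gives FFFFFFF0H. The sketch below is illustrative only; the function and constant names are hypothetical, and the second function simply applies the ordinary real mode rule (selector shifted left by four bits) to show why the special CS base is needed.

```python
CS_BASE_AFTER_RESET = 0xFFFF0000   # Table 2-7, note 3
IP_AFTER_RESET = 0x0000FFF0        # Table 2-7, instruction pointer
CS_SELECTOR_AFTER_RESET = 0xF000   # Table 2-7, code segment


def first_fetch_address(cs_base=CS_BASE_AFTER_RESET, ip=IP_AFTER_RESET):
    """Address of the first instruction fetched after reset."""
    return (cs_base + ip) & 0xFFFFFFFF


def selector_based_address(selector, offset):
    """Ordinary real mode address: selector shifted left 4 bits plus offset."""
    return ((selector << 4) + offset) & 0xFFFFFFFF


print(hex(first_fetch_address()))
print(hex(selector_based_address(CS_SELECTOR_AFTER_RESET, IP_AFTER_RESET)))
```

The first value is FFFFFFF0H, the top-of-memory location named in the text. The second is 000FFFF0H, which is what the same selector and offset yield once address lines A20-A31 have dropped low.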
1.11 Testability

1.11.1 Self-Test

The Military Intel386 microprocessor has the capability to perform a self-test. The self-test checks the function of all of the control ROM and most of the non-random logic of the part. Approximately one-half of the Military Intel386 microprocessor can be tested during self-test.

Self-test is initiated on the Military Intel386 microprocessor when the RESET pin transitions from high to low, and the BUSY# pin is low. The self-test takes about 2 ** 19 clocks, or approximately 33 milliseconds with a 16 MHz Military Intel386 microprocessor. At the completion of self-test the processor performs reset and begins normal operation. The part has successfully passed self-test if the contents of the EAX register are zero (0). If the contents of EAX are not zero then the self-test has detected a flaw in the part.

1.11.2 TLB Testing

The Military Intel386 microprocessor provides a mechanism for testing the Translation Lookaside Buffer (TLB) if desired. This particular mechanism is unique to the Military Intel386 microprocessor and may not be continued in the same way in future processors. When testing the TLB it is recommended that paging be turned off (PG = 0 in CR0) to avoid interference with the test data being written to the TLB.

There are two TLB testing operations: 1) write entries into the TLB, and 2) perform TLB lookups. Two test registers, shown in Figure 2-12, are provided for the purpose of testing. TR6 is the "test command register", and TR7 is the "test data register". The fields within these registers are defined below.

C: This is the command bit. For a write into TR6 to cause an immediate write into the TLB entry, write a 0 to this bit. For a write into TR6 to cause an immediate TLB lookup, write a 1 to this bit.

Linear Address: This is the tag field of the TLB.
On a TLB write, a TLB entry is allocated to this linear address and the rest of that TLB entry is set per the value of TR7 and the value just written into TR6. On a TLB lookup, the TLB is interrogated per this value and if one and only one TLB entry matches, the rest of the fields of TR6 and TR7 are set from the matching TLB entry.

Physical Address: This is the data field of the TLB. On a write to the TLB, the TLB entry allocated to the linear address in TR6 is set to this value. On a TLB lookup, the data field (physical address) from the TLB is read out to here.

PL: On a TLB write, PL = 1 causes the REP field of TR7 to select which of four associative blocks of the TLB is to be written, but PL = 0 allows the internal pointer in the paging unit to select which TLB block is written. On a TLB lookup, the PL bit indicates whether the lookup was a hit (PL gets set to 1) or a miss (PL gets reset to 0).

V: The valid bit for this TLB entry. All valid bits can also be cleared by writing to CR3.

D, D#: The dirty bit for/from the TLB entry.

U, U#: The user bit for/from the TLB entry.

W, W#: The writable bit for/from the TLB entry.

For D, U and W, both the attribute and its complement are provided as tag bits, to permit the option of a "don't care" on TLB lookups. The meaning of these pairs of bits is given in the following table:

  X   X#   Effect during TLB lookup   Value of bit X after TLB write
  0   0    Miss all                   Bit X becomes undefined
  0   1    Match if X = 0             Bit X becomes 0
  1   0    Match if X = 1             Bit X becomes 1
  1   1    Match all                  Bit X becomes undefined

For writing a TLB entry:
  1. Write TR7 for the desired physical address, PL and REP values.
  2. Write TR6 with the appropriate linear address, etc. (be sure to write C = 0 for "write" command).

For looking up (reading) a TLB entry:
  1. Write TR6 with the appropriate linear address (be sure to write C = 1 for "lookup" command).
  2. Read TR7 and TR6. If the PL bit in TR7 indicates a hit, then the other values reveal the TLB contents. If PL indicates a miss, then the other values in TR7 and TR6 are indeterminate.

1.12 Debugging Support

The Military Intel386 microprocessor provides several features which simplify the debugging process. The three categories of on-chip debugging aids are:
  1) the code execution breakpoint opcode (0CCH),
  2) the single-step capability provided by the TF bit in the flag register, and
  3) the code and data breakpoint capability provided by the debug registers DR0-3, DR6, and DR7.
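Returning briefly to the TLB test registers of section 1.11.2: the values written to TR7 and TR6 can be composed as in the sketch below. It assumes the bit positions obtained by counting fields from bit 31 downward in Figure 2-12 (V at bit 11, the D/U/W pairs at bits 10-5, C at bit 0, PL at bit 4, REP at bits 3-2). The function names are hypothetical, and the sketch only builds the 32-bit values; it does not access any register.

```python
def attribute_pair(value):
    """Return (X, X#) per the pair table in section 1.11.2.

    True -> (1, 0), False -> (0, 1), None -> (1, 1), i.e. "match all"
    on a lookup. (1, 1) leaves the bit undefined on a TLB write, so
    pass explicit values when composing a write command.
    """
    if value is None:
        return (1, 1)
    return (1, 0) if value else (0, 1)


def pack_tr7(physical_address, pl, rep):
    """Compose TR7: physical address (bits 31-12), PL (bit 4), REP (bits 3-2)."""
    return (physical_address & 0xFFFFF000) | ((pl & 1) << 4) | ((rep & 3) << 2)


def pack_tr6(linear_address, command, valid=1, dirty=None, user=None, writable=None):
    """Compose TR6: linear address, V, D/D#, U/U#, W/W#, and C (0 = write, 1 = lookup)."""
    word = linear_address & 0xFFFFF000
    word |= (valid & 1) << 11
    shift = 10                          # D at bit 10, D# at 9, U at 8, ...
    for attr in (dirty, user, writable):
        x, x_bar = attribute_pair(attr)
        word |= (x << shift) | (x_bar << (shift - 1))
        shift -= 2
    return word | (command & 1)
```

For a write, TR7 would be composed first and TR6 second with command 0, following the two-step procedure in section 1.11.2; for a lookup, only TR6 is composed, with command 1.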
Figure 2-12. Test Registers

  TR6 (test command register)
    Bits 31-12   Linear address
    Bit 11       V
    Bit 10       D
    Bit 9        D#
    Bit 8        U
    Bit 7        U#
    Bit 6        W
    Bit 5        W#
    Bits 4-1     0
    Bit 0        C

  TR7 (test data register)
    Bits 31-12   Physical address
    Bits 11-5    0
    Bit 4        PL
    Bits 3-2     REP
    Bits 1-0     0

  Note: 0 indicates Intel reserved: do not define; see section 1.3.10.

1.12.1 Breakpoint Instruction

A single-byte-opcode breakpoint instruction is available for use by software debuggers. The breakpoint opcode is 0CCH, and generates an exception 3 trap when executed. In typical use, a debugger program can "plant" the breakpoint instruction at all desired code execution breakpoints. The single-byte breakpoint opcode is an alias for the two-byte general software interrupt instruction, INT n, where n = 3. The only difference between INT 3 (0CCH) and INT n is that INT 3 is never IOPL-sensitive but INT n is IOPL-sensitive in protected mode and virtual M8086 mode.

1.12.2 Single-Step Trap

If the single-step flag (TF, bit 8) in the EFLAGS register is found to be set at the end of an instruction, a single-step exception occurs. The single-step exception is autovectored to exception number 1. Precisely, exception 1 occurs as a trap after the instruction following the instruction which set TF. In typical practice, a debugger sets the TF bit of a flag register image on the debugger's stack. It then typically transfers control to the user program and loads the flag image with a single instruction, the IRET instruction. The single-step trap occurs after executing one instruction of the user program.

Since exception 1 occurs as a trap (that is, it occurs after the instruction has already executed), the CS:EIP pushed onto the debugger's stack points to the next unexecuted instruction of the program being debugged. An exception 1 handler, merely by ending with an IRET instruction, can therefore efficiently support single-stepping through a user program.

1.12.3 Debug Registers

The debug registers are an advanced debugging feature of the Military Intel386 microprocessor.
They allow data access breakpoints as well as code execution breakpoints. Since the breakpoints are indicated by on-chip registers, an instruction execution breakpoint can be placed in ROM code or in code shared by several tasks, neither of which can be supported by the INT 3 breakpoint opcode.

The Military Intel386 microprocessor contains six debug registers, providing the ability to specify up to four distinct breakpoint addresses, breakpoint control options, and read breakpoint status. Initially after reset, breakpoints are in the disabled state. Therefore, no breakpoints will occur unless the debug registers are programmed. Breakpoints set up in the debug registers are autovectored to exception number 1.

1.12.3.1 Linear Address Breakpoint Registers (DR0-DR3)

Up to four breakpoint addresses can be specified by writing into debug registers DR0-DR3, shown in Figure 2-13. The breakpoint addresses specified are 32-bit linear addresses. Military Intel386 microprocessor hardware continuously compares the linear breakpoint addresses in DR0-DR3 with the linear addresses generated by executing software (a linear address is the result of computing the effective address and adding the 32-bit segment base address). Note that if paging is not enabled the linear address equals the physical address. If paging is enabled, the linear address is translated to a physical 32-bit address by the on-chip paging unit. Regardless of whether paging is enabled or not, however, the breakpoint registers hold linear addresses.

1.12.3.2 Debug Control Register (DR7)

A debug control register, DR7, shown in Figure 2-13, allows several debug control functions such as enabling the breakpoints and setting up other control options for the breakpoints. The fields within the debug control register, DR7, are as follows:

LENi (breakpoint length specification bits)

A 2-bit LEN field exists for each of the four breakpoints. LEN specifies the length of the associated breakpoint field.
The choices for data breakpoints are: 1 byte, 2 bytes, and 4 bytes. Instruction execution breakpoints must have a length of 1 (LENi = 00).

Figure 2-13. Debug Registers

  Register   Contents (bits 31-0)
  DR0        Breakpoint 0 linear address
  DR1        Breakpoint 1 linear address
  DR2        Breakpoint 2 linear address
  DR3        Breakpoint 3 linear address
  DR4        Intel reserved. Do not define.
  DR5        Intel reserved. Do not define.
  DR6        Bits 31-16: 0; bit 15: BT; bit 14: BS; bit 13: BD; bits 12-4: 0; bits 3-0: B3, B2, B1, B0
  DR7        Bits 31-16: LEN3, RW3, LEN2, RW2, LEN1, RW1, LEN0, RW0 (two bits each); bits 15-14: 0; bit 13: GD; bits 12-10: 0; bit 9: GE; bit 8: LE; bits 7-0: G3, L3, G2, L2, G1, L1, G0, L0

  Note: 0 indicates Intel reserved: do not define; see section 1.3.10.

Encoding of the LENi field is as follows:

  LENi       Breakpoint    Usage of least significant bits in
  encoding   field width   breakpoint address register i (i = 0-3)
  00         1 byte        All 32 bits used to specify a single-byte breakpoint field.
  01         2 bytes       A1-A31 used to specify a two-byte, word-aligned breakpoint field. A0 in breakpoint address register is not used.
  10         Undefined: do not use this encoding
  11         4 bytes       A2-A31 used to specify a four-byte, dword-aligned breakpoint field. A0 and A1 in breakpoint address register are not used.

The LENi field controls the size of breakpoint field i by controlling whether all low-order linear address bits in the breakpoint address register are used to detect the breakpoint event. Therefore, all breakpoint fields are aligned; 2-byte breakpoint fields begin on word boundaries, and 4-byte breakpoint fields begin on dword boundaries.

The following is an example of various size breakpoint fields. Assume the breakpoint linear address in DR2 is 00000005H. In that situation, the following illustration indicates the region of the breakpoint field for lengths of 1, 2, or 4 bytes.

  (Illustration: memory map of the dwords at 00000000H, 00000004H and 00000008H, bits 31-0, with breakpoint field 2 marked.)
  DR2 = 00000005H; LEN2 = 00B: breakpoint field 2 is the byte at 00000005H.
  DR2 = 00000005H; LEN2 = 01B: breakpoint field 2 is the word at 00000004H-00000005H.
  DR2 = 00000005H; LEN2 = 11B: breakpoint field 2 is the dword at 00000004H-00000007H.
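The alignment rule implied by the LENi encoding (low-order address bits ignored for 2-byte and 4-byte fields) can be expressed as a short calculation. The following is a minimal sketch; the function name and the returned (first byte, last byte) form are hypothetical conveniences, not part of the register interface.

```python
LEN_TO_SIZE = {0b00: 1, 0b01: 2, 0b11: 4}   # encoding 10B is undefined


def breakpoint_field(address, len_bits):
    """Return (first_byte, last_byte) of the breakpoint field.

    address:  32-bit linear address held in DRi.
    len_bits: the 2-bit LENi encoding.
    """
    if len_bits not in LEN_TO_SIZE:
        raise ValueError("LEN encoding 10B is undefined; do not use")
    size = LEN_TO_SIZE[len_bits]
    # Ignoring A0 (2 bytes) or A0-A1 (4 bytes) aligns the field downward.
    first = address & ~(size - 1) & 0xFFFFFFFF
    return first, first + size - 1


for len_bits in (0b00, 0b01, 0b11):
    first, last = breakpoint_field(0x00000005, len_bits)
    print(f"LEN2 = {len_bits:02b}B: {first:08X}H-{last:08X}H")
```

For DR2 = 00000005H this prints the fields 00000005H-00000005H, 00000004H-00000005H and 00000004H-00000007H, in agreement with the example in the text.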
RWi (memory access qualifier bits)

A 2-bit RW field exists for each of the four breakpoints. The 2-bit RW field specifies the type of usage which must occur in order to activate the associated breakpoint.

  RW encoding   Usage causing breakpoint
  00            Instruction execution only
  01            Data writes only
  10            Undefined: do not use this encoding
  11            Data reads and writes only

RW encoding 00 is used to set up an instruction execution breakpoint. RW encodings 01 or 11 are used to set up write-only or read/write data breakpoints.

Note that instruction execution breakpoints are taken as faults (i.e. before the instruction executes), but data breakpoints are taken as traps (i.e. after the data transfer takes place).

Using LENi and RWi to set data breakpoint i

A data breakpoint can be set up by writing the linear address into DRi (i = 0-3). For data breakpoints, RWi can be 01 (write-only) or 11 (write/read). LEN can be 00, 01, or 11.

If a data access entirely or partly falls within the data breakpoint field, the data breakpoint condition has occurred, and if the breakpoint is enabled, an exception 1 trap will occur.

Using LENi and RWi to set instruction execution breakpoint i

An instruction execution breakpoint can be set up by writing the address of the beginning of the instruction (including prefixes if any) into DRi (i = 0-3). RWi must be 00 and LEN must be 00 for instruction execution breakpoints.

If the instruction beginning at the breakpoint address is about to be executed, the instruction execution breakpoint condition has occurred, and if the breakpoint is enabled, an exception 1 fault will occur before the instruction is executed.

Note that an instruction execution breakpoint address must be equal to the beginning byte address of an instruction (including prefixes) in order for the instruction execution breakpoint to occur.
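The data breakpoint condition ("a data access entirely or partly falls within the data breakpoint field") amounts to an interval overlap test combined with the RWi qualifier. The sketch below models that rule only; the function and parameter names are hypothetical, and enabling (Gi/Li) and exception reporting are outside its scope.

```python
def data_breakpoint_hit(dr, len_bits, rw_bits, access_addr, access_size, is_write):
    """Return True if a data access meets the data breakpoint condition.

    dr:          linear address in DRi
    len_bits:    LENi encoding (00B, 01B or 11B)
    rw_bits:     RWi encoding (01B = data writes, 11B = data reads and writes)
    access_addr: linear address of the first byte accessed
    access_size: number of bytes accessed
    is_write:    True for a write, False for a read
    """
    if rw_bits not in (0b01, 0b11):
        return False                    # 00B is instruction execution; 10B is undefined
    if rw_bits == 0b01 and not is_write:
        return False                    # write-only breakpoint ignores reads
    size = {0b00: 1, 0b01: 2, 0b11: 4}[len_bits]
    first = dr & ~(size - 1)            # aligned start of the breakpoint field
    last = first + size - 1
    access_last = access_addr + access_size - 1
    # Partial overlap is enough to meet the condition.
    return access_addr <= last and access_last >= first
```

As a usage example, with DRi = 00000005H, LENi = 11B and RWi = 01B, a dword write at 00000006H overlaps the field 00000004H-00000007H and meets the condition, while a read of the same location does not.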
GD (global debug register access detect)

The debug registers can only be accessed in real mode or at privilege level 0 in protected mode. The GD bit, when set, provides extra protection against any debug register access even in real mode or at privilege level 0 in protected mode. This additional protection feature is provided to guarantee that a software debugger (or ICE-386) can have full control over the debug register resources when required. The GD bit, when set, causes an exception 1 fault if an instruction attempts to read or write any debug register. The GD bit is then automatically cleared when the exception 1 handler is invoked, allowing the exception 1 handler free access to the debug registers.

GE and LE (exact data breakpoint match, global and local)

If either GE or LE is set, any data breakpoint trap will be reported exactly after completion of the instruction that caused the operand transfer. Exact reporting is provided by forcing the Military Intel386 microprocessor execution unit to wait for completion of data operand transfers before beginning execution of the next instruction.

If exact data breakpoint match is not selected, data breakpoints may not be reported until several instructions later or may not be reported at all. When enabling a data breakpoint, it is therefore recommended to enable the exact data breakpoint match.

When the Military Intel386 microprocessor performs a task switch, the LE bit is cleared. Thus, the LE bit supports fast task switching out of tasks that have enabled the exact data breakpoint match for their task-local breakpoints. The LE bit is cleared by the processor during a task switch, to avoid having exact data breakpoint match enabled in the new task. Note that exact data breakpoint match must be re-enabled under software control.

The Military Intel386 microprocessor GE bit is unaffected during a task switch.
The GE bit supports exact data breakpoint match that is to remain enabled during all tasks executing in the system.

Note that instruction execution breakpoints are always reported exactly, whether or not exact data breakpoint match is selected.

Gi and Li (breakpoint enable, global and local)

If either Gi or Li is set then the associated breakpoint (as defined by the linear address in DRi, the length in LENi and the usage criteria in RWi) is enabled. If either Gi or Li is set, and the Military Intel386 microprocessor detects the ith breakpoint condition, then the exception 1 handler is invoked.

When the Military Intel386 microprocessor performs a task switch to a new TSS, all Li bits are cleared. Thus, the Li bits support fast task switching out of tasks that use some task-local breakpoint registers.
The Li bits are cleared by the processor during a task switch, to avoid spurious exceptions in the new task. Note that the breakpoints must be re-enabled under software control.

All Military Intel386 microprocessor Gi bits are unaffected during a task switch. The Gi bits support breakpoints that are active in all tasks executing in the system.

1.12.3.3 Debug Status Register (DR6)

A debug status register, DR6, shown in Figure 2-13, allows the exception 1 handler to easily determine why it was invoked. Note the exception 1 handler can be invoked as a result of one of several events:

  1) DR0 breakpoint fault/trap.
  2) DR1 breakpoint fault/trap.
  3) DR2 breakpoint fault/trap.
  4) DR3 breakpoint fault/trap.
  5) Single-step (TF) trap.
  6) Task switch trap.
  7) Fault due to attempted debug register access when GD = 1.

The debug status register contains single-bit flags for each of the possible events invoking exception 1. Note below that some of these events are faults (exception taken before the instruction is executed), while other events are traps (exception taken after the debug events occurred).

The flags in DR6 are set by the hardware but never cleared by hardware. Exception 1 handler software should clear DR6 before returning to the user program to avoid future confusion in identifying the source of exception 1.

The fields within the debug status register, DR6, are as follows:

Bi (debug fault/trap due to breakpoint 0-3)

Four breakpoint indicator flags, B0-B3, correspond one-to-one with the breakpoint registers DR0-DR3. A flag Bi is set when the condition described by DRi, LENi, and RWi occurs.

If Gi or Li is set, and if the ith breakpoint is detected, the processor will invoke the exception 1 handler. The exception is handled as a fault if an instruction execution breakpoint occurred, or as a trap if a data breakpoint occurred.
Important note: A flag Bi is set whenever the hardware detects a match condition on enabled breakpoint i. Whenever a match is detected on at least one enabled breakpoint i, the hardware immediately sets all Bi bits corresponding to breakpoint conditions matching at that instant, whether enabled or not. Therefore, the exception 1 handler may see that multiple Bi bits are set, but only set Bi bits corresponding to enabled breakpoints (Li or Gi set) are true indications of why the exception 1 handler was invoked.

BD (debug fault due to attempted register access when GD bit set)

This bit is set if the exception 1 handler was invoked due to an instruction attempting to read or write to the debug registers when the GD bit was set. If such an event occurs, then the GD bit is automatically cleared when the exception 1 handler is invoked, allowing handler access to the debug registers.

BS (debug trap due to single-step)

This bit is set if the exception 1 handler was invoked due to the TF bit in the flag register being set (for single-stepping). See section 1.12.2.

BT (debug trap due to task switch)

This bit is set if the exception 1 handler was invoked due to a task switch occurring to a task having a Military Intel386 microprocessor TSS with the T bit set (see Figure 4-15a). Note the task switch into the new task occurs normally, but before the first instruction of the task is executed, the exception 1 handler is invoked. With respect to the task switch operation, the operation is considered to be a trap.

1.12.3.4 Use of Resume Flag (RF) in Flag Register

The resume flag (RF) in the flag word can suppress an instruction execution breakpoint when the exception 1 handler returns to a user program at a user address which is also an instruction execution breakpoint. See section 1.3.3.

2.0 Real Mode Architecture

2.1 Real Mode Introduction

When the processor is reset or powered up it is initialized in real mode.
Real mode has the same base architecture as the M8086, but allows access to the 32-bit register set of the Military Intel386 microprocessor. The addressing mechanism, memory size, and interrupt handling are all identical to the real mode on the M80286.

All of the Military Intel386 microprocessor instructions are available in real mode (except those instructions listed in 3.6.4). The default operand size in real mode is 16 bits, just like the M8086. In order to use the 32-bit registers and addressing modes, override prefixes must be used. In addition, the segment size on the Military Intel386 microprocessor in real mode is 64 Kbytes, so 32-bit effective addresses must have a value less than 0000FFFFH. The primary purpose of real mode is to set up the processor for protected mode operation.

Figure 3-1. Real Address Mode Addressing

The LOCK prefix on the Military Intel386 processor, even in real mode, is more restrictive than on the M80286. This is due to the addition of paging on the Military Intel386 processor in protected mode and virtual M8086 mode. Paging makes it impossible to guarantee that repeated string instructions can be locked. The Military Intel386 processor can't require that all pages holding the string be physically present in memory. Hence, a page fault (exception 14) might have to be taken during the repeated string instruction. Therefore the LOCK prefix can't be supported during repeated string instructions.

These are the only instruction forms where the LOCK prefix is legal on the Military Intel386 processor:

  Opcode                               Operands (dest, source)
  Bit test and set/reset/complement    mem, reg/immed
  XCHG                                 reg, mem
  XCHG                                 mem, reg
  ADD, OR, ADC, SBB, AND, SUB, XOR     mem, reg/immed
  NOT, NEG, INC, DEC                   mem

An exception 6 will be generated if a LOCK prefix is placed before any instruction form or opcode not listed above. The LOCK prefix allows indivisible read/modify/write operations on memory operands using the instructions above. For example, even the "ADD reg, mem" form is not lockable, because the mem operand is not the destination (and therefore no memory read/modify/write operation is being performed).

Since, on the Military Intel386 microprocessor, repeated string instructions are not lockable, it is not possible to lock the bus for a long period of time. Therefore, the LOCK prefix is not IOPL-sensitive on the Military Intel386 microprocessor. The LOCK prefix can be used at any privilege level, but only on the instruction forms listed above.

2.2 Memory Addressing

In real mode the maximum memory size is limited to 1 megabyte.
Thus, only address lines A2-A19 are active. (Exception: the high address lines A20-A31 are high during CS-relative memory cycles until an intersegment jump or call is executed; see section 1.10.)

Since paging is not allowed in real mode the linear addresses are the same as physical addresses. Physical addresses are formed in real mode by adding the contents of the appropriate segment register, shifted left by four bits, to an effective address. This addition results in a physical address from 00000000H to 0010FFEFH. This is compatible with M80286 real mode. Since segment registers are shifted left by 4 bits, real mode segments always start on 16-byte boundaries.

All segments in real mode are exactly 64 Kbytes long, and may be read, written, or executed. The Military Intel386 processor will generate an exception 13 if a data operand or instruction fetch occurs past the end of a segment (i.e. if an operand has an offset greater than FFFFH, for example a word with a low byte at FFFFH and the high byte at 0000H).

Segments may be overlapped in real mode. Thus, if a particular segment does not use all 64 Kbytes, another segment can be overlaid on top of the unused portion of the previous segment. This allows the programmer to minimize the amount of physical memory needed for a program.
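The real mode address formation described in section 2.2 is a single shift and add. The sketch below is illustrative; the function name is hypothetical, and the range check merely stands in for the exception 13 that the text says the processor generates for offsets beyond FFFFH.

```python
def real_mode_physical(segment, offset):
    """Physical address = (segment register << 4) + effective address."""
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment register is 16 bits")
    if not 0 <= offset <= 0xFFFF:
        # The processor would raise exception 13 for such a reference.
        raise ValueError("offset beyond FFFFH")
    return (segment << 4) + offset


print(hex(real_mode_physical(0x0000, 0x0000)))
print(hex(real_mode_physical(0xFFFF, 0xFFFF)))
print(real_mode_physical(0x1000, 0x0010) == real_mode_physical(0x1001, 0x0000))
```

The first two lines give the bounds quoted in the text, 00000000H and 0010FFEFH. The third shows two different segment:offset pairs reaching the same byte, which is what makes overlapping segments on 16-byte boundaries possible.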
2.3 Reserved Locations

There are two fixed areas in memory which are reserved in real address mode: the system initialization area and the interrupt table area. Locations 00000H through 003FFH are reserved for interrupt vectors. Each one of the 256 possible interrupts has a 4-byte jump vector reserved for it. Locations FFFFFFF0H through FFFFFFFFH are reserved for system initialization.

2.4 Interrupts

Many of the exceptions shown in Table 2-5 and discussed in section 1.9 are not applicable to real mode operation; in particular, exceptions 10, 11 and 14 will not happen in real mode. Other exceptions have slightly different meanings in real mode; Table 3-1 identifies these exceptions.

2.5 Shutdown and Halt

The HLT instruction stops program execution and prevents the processor from using the local bus until restarted. Either NMI, INTR with interrupts enabled (IF = 1), or RESET will force the Military Intel386 microprocessor out of halt. If interrupted, the saved CS:IP will point to the next instruction after the HLT.

Shutdown will occur when a severe error is detected that prevents further processing. In real mode, shutdown can occur under two conditions:

  - An interrupt or an exception occurs (exceptions 8 or 13) and the interrupt vector is larger than the interrupt descriptor table (i.e. there is not an interrupt handler for the interrupt).
  - A CALL, INT or PUSH instruction attempts to wrap around the stack segment when SP is not even (e.g. pushing a value on the stack when SP = 0001, resulting in a stack segment greater than FFFFH).

An NMI input can bring the processor out of shutdown if the interrupt descriptor table limit is large enough to contain the NMI interrupt vector (at least 0017H) and the stack has enough room to contain the vector and flag information (i.e. SP is greater than 0005H). Otherwise shutdown can only be exited via the RESET input.
3.0 Protected Mode Architecture

3.1 Introduction

The complete capabilities of the Military Intel386 microprocessor are unlocked when it operates in protected virtual address mode (protected mode). Protected mode vastly increases the linear address space to four gigabytes (2 ** 32 bytes) and allows the running of virtual memory programs of almost unlimited size (64 terabytes or 2 ** 46 bytes). In addition, protected mode allows the Military Intel386 processor to run all of the existing M8086 and M80286 software, while providing sophisticated memory management and a hardware-assisted protection mechanism. Protected mode allows the use of additional instructions especially optimized for supporting multitasking operating systems. The base architecture of the Military Intel386 microprocessor remains the same; the registers, instructions, and addressing modes described in the previous sections are retained. The main difference between protected mode and real mode from a programmer's view is the increased address space and a different addressing mechanism.

Table 3-1

  Function                         Interrupt   Related instructions                    Return address
                                   number                                              location
  Interrupt table limit            8           INT vector is not within table limit    Before instruction
  too small
  CS, DS, ES, FS, GS segment       13          Word memory reference beyond offset     Before instruction
  overrun exception                            = FFFFH. An attempt to execute past
                                               the end of CS segment.
  SS segment overrun exception     12          Stack reference beyond offset = FFFFH   Before instruction
3.2 Addressing Mechanism

Like Real Mode, Protected Mode uses two components to form the logical address. A 16-bit selector is used to determine the linear base address of a segment, and the base address is added to a 32-bit effective address to form a 32-bit linear address. The linear address is then either used as the 32-bit physical address or, if paging is enabled, mapped by the paging mechanism into a 32-bit physical address.

The difference between the two modes lies in calculating the base address. In Protected Mode the selector is used to specify an index into an operating system defined table (see Figure 4-1). The table contains the 32-bit base address of a given segment. The physical address is formed by adding the base address obtained from the table to the offset.

Paging provides an additional memory management mechanism which operates only in Protected Mode. Paging provides a means of managing the very large segments of the Military Intel386 microprocessor. As such, paging operates beneath segmentation. The paging mechanism translates the protected linear address which comes from the segmentation unit into a physical address. Figure 4-2 shows the complete Military Intel386 microprocessor addressing mechanism with paging enabled.

Figure 4-1. Protected Mode Addressing

Figure 4-2. Paging and Segmentation
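The address formation described in section 3.2 can be sketched as follows. This is a simplified illustration with paging disabled and no limit or privilege checks. The table is modelled as a plain mapping from descriptor index to segment base, the index is taken from the upper 13 bits of the selector, and the result is masked to 32 bits; these modelling choices and all names are assumptions made for the example.

```python
def linear_address(segment_bases: dict, selector: int, offset: int) -> int:
    """Form a 32-bit linear address from a selector and a 32-bit offset.

    segment_bases is a hypothetical stand-in for a descriptor table:
    it maps a descriptor index to that segment's 32-bit base address.
    """
    index = selector >> 3           # upper 13 bits of the 16-bit selector
    base = segment_bases[index]     # base address obtained from the table
    return (base + offset) & 0xFFFFFFFF  # keep the result within 32 bits


# Example: descriptor 1 describes a segment based at 00100000h.
bases = {1: 0x00100000}
address = linear_address(bases, selector=0x0008, offset=0x1234)
```

With paging disabled, the value computed here would also be the physical address; with paging enabled it would be the input to the paging mechanism.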
3.3 Segmentation

3.3.1 Segmentation Introduction

Segmentation is one method of memory management. Segmentation provides the basis for protection. Segments are used to encapsulate regions of memory which have common attributes. For example, all of the code of a given program could be contained in a segment, or an operating system table may reside in a segment. All information about a segment is stored in an 8-byte data structure called a descriptor. All of the descriptors in a system are contained in tables recognized by hardware.

3.3.2 Terminology

The following terms are used throughout the discussion of descriptors, privilege levels and protection:

PL: Privilege Level. One of the four hierarchical privilege levels. Level 0 is the most privileged level and level 3 is the least privileged. More privileged levels are numerically smaller than less privileged levels.

RPL: Requestor Privilege Level. The privilege level of the original supplier of the selector. RPL is determined by the two least significant bits of a selector.

DPL: Descriptor Privilege Level. This is the least privileged level at which a task may access that descriptor (and the segment associated with that descriptor). Descriptor Privilege Level is determined by bits 6:5 in the access rights byte of a descriptor.

CPL: Current Privilege Level. The privilege level at which a task is currently executing, which equals the privilege level of the code segment being executed. CPL can also be determined by examining the lowest 2 bits of the CS register, except for conforming code segments.

EPL: Effective Privilege Level. The effective privilege level is the least privileged of the RPL and DPL. Since smaller privilege level values indicate greater privilege, EPL is the numerical maximum of RPL and DPL.

Task: One instance of the execution of a program. Tasks are also referred to as processes.
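Because privilege is numerically inverted (level 0 is the most privileged), the EPL definition above is easy to misread. The short sketch below restates the definitions in section 3.3.2 as code; the function names are illustrative only.

```python
def effective_privilege_level(rpl: int, dpl: int) -> int:
    """EPL is the least privileged of RPL and DPL, i.e. the numeric maximum."""
    for level in (rpl, dpl):
        if level not in (0, 1, 2, 3):
            raise ValueError("privilege levels range from 0 to 3")
    return max(rpl, dpl)


def is_more_privileged(a: int, b: int) -> bool:
    """Return True if level a is more privileged than level b.

    More privileged levels are numerically smaller.
    """
    return a < b
```

For example, a selector with RPL = 3 that refers to a descriptor with DPL = 0 yields EPL = 3, the less privileged of the two.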
3.3.3 Descriptor Tables

3.3.3.1 Descriptor Tables Introduction

The descriptor tables define all of the segments which are used in a Military Intel386 microprocessor system. There are three types of tables on the Military Intel386 microprocessor which hold descriptors: the Global Descriptor Table, the Local Descriptor Table, and the Interrupt Descriptor Table. All of the tables are variable length memory arrays. They can range in size between 8 bytes and 64 Kbytes. Each table can hold up to 8192 8-byte descriptors. The upper 13 bits of a selector are used as an index into the descriptor table. The tables have registers associated with them which hold the 32-bit linear base address and the 16-bit limit of each table.

Each of the tables has a register associated with it: the GDTR, the LDTR, and the IDTR (see Figure 4-3). The LGDT, LLDT, and LIDT instructions load the base and limit of the Global, Local, and Interrupt Descriptor Tables, respectively, into the appropriate register. The SGDT, SLDT, and SIDT instructions store the base and limit values. These tables are manipulated by the operating system. Therefore, the load descriptor table instructions are privileged instructions.

Figure 4-3. Descriptor Table Registers
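The table sizes quoted in section 3.3.3.1 follow from the 8-byte descriptor size and the 16-bit table limit. The sketch below works through that arithmetic. It assumes that the limit holds the offset of the last valid byte in the table (table size in bytes minus one), which is the usual interpretation but is not stated explicitly in the text above; the function names are illustrative.

```python
DESCRIPTOR_SIZE = 8  # bytes per descriptor (section 3.3.1)


def descriptor_capacity(table_limit: int) -> int:
    """Number of whole descriptors in a table with the given 16-bit limit.

    Assumption: the limit is the offset of the last valid byte.
    """
    return (table_limit + 1) // DESCRIPTOR_SIZE


def descriptor_offset(selector: int) -> int:
    """Byte offset of the descriptor selected by the upper 13 selector bits."""
    return (selector >> 3) * DESCRIPTOR_SIZE


def descriptor_in_table(selector: int, table_limit: int) -> bool:
    """True if all 8 bytes of the selected descriptor lie within the limit."""
    return descriptor_offset(selector) + DESCRIPTOR_SIZE - 1 <= table_limit
```

With the maximum limit of FFFFh the capacity works out to 8192 descriptors, matching the figure given in the text.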
3.3.3.2 Global Descriptor Table

The Global Descriptor Table (GDT) contains descriptors which are possibly available to all of the tasks in a system. The GDT can contain any type of segment descriptor except for descriptors which are used for servicing interrupts (i.e. interrupt and trap descriptors). Every Military Intel386 microprocessor system contains a GDT. Generally the GDT contains code and data segments used by the operating system, task state segments, and descriptors for the LDTs in a system.

The first slot of the Global Descriptor Table corresponds to the null selector and is not used. The null selector defines a null pointer value.

3.3.3.3 Local Descriptor Table

LDTs contain descriptors which are associated with a given task. Generally, operating systems are designed so that each task has a separate LDT. The LDT may contain only code, data, stack, task gate, and call gate descriptors. LDTs provide a mechanism for isolating a given task's code and data segments from the rest of the operating system, while the GDT contains descriptors for segments which are common to all tasks. A segment cannot be accessed by a task if its segment descriptor does not exist in either the current LDT or the GDT. This provides both isolation and protection for a task's segments, while still allowing global data to be shared among tasks.

Unlike the 6-byte GDT or IDT registers, which contain a base address and limit, the visible portion of the LDT register contains only a 16-bit selector. This selector refers to a Local Descriptor Table descriptor in the GDT.

3.3.3.4 Interrupt Descriptor Table

The third table needed for Military Intel386 microprocessor systems is the Interrupt Descriptor Table (see Figure 4-4). The IDT contains the descriptors which point to the location of up to 256 interrupt service routines. The IDT may contain only task gates, interrupt gates, and trap gates.
The IDT should be at least 256 bytes in size in order to hold the descriptors for the 32 Intel reserved interrupts. Every interrupt used by a system must have an entry in the IDT. The IDT entries are referenced via INT instructions, external interrupt vectors, and exceptions (see 1.9 Interrupts).

Figure 4-4. Interrupt Descriptor Table Register Use

3.3.4 Descriptors

3.3.4.1 Descriptor Attribute Bits

The object to which the segment selector points is called a descriptor. Descriptors are eight-byte quantities which contain attributes about a given region of linear address space (i.e. a segment). These attributes include the 32-bit base linear address of the segment, the 20-bit length and granularity of the segment, the protection level, read, write or execute privileges, the default size of the operands (16-bit or 32-bit), and the type of segment. All of the attribute information about a segment is contained in 12 bits in the segment descriptor. Figure 4-5 shows the general format of a descriptor.

Byte address 0  (bits 31...0): Segment base 15...0 | Segment limit 15...0
Byte address +4 (bits 31...0): Base 31...24 | G | D | 0 | 0 | Limit 19...16 | P | DPL | S | Type | A | Base 23...16

BASE   Base address of the segment
LIMIT  The length of the segment
P      Present bit: 1 = present, 0 = not present
DPL    Descriptor Privilege Level 0-3
S      Segment descriptor: 0 = system descriptor, 1 = code or data segment descriptor
TYPE   Type of segment
A      Accessed bit
G      Granularity bit: 1 = segment length is page granular, 0 = segment length is byte granular
D      Default operation size (recognized in code segment descriptors only): 1 = 32-bit segment, 0 = 16-bit segment
0      Bit must be zero (0) for compatibility with future processors

Note: In a maximum-sized segment (i.e. a segment with G = 1 and segment limit 19...0 = FFFFFh), the lowest 12 bits of the segment base should be zero (i.e. segment base 11...0 = 000h).

Figure 4-5. Segment Descriptors

All segments on the Military Intel386 microprocessor have three attribute fields in common: the P bit, the DPL field, and the S bit. The present P bit is 1 if the segment is loaded in physical memory. If P = 0, any attempt to access this segment causes a not present exception (exception 11). The Descriptor Privilege Level DPL is a two-bit field which specifies the protection level (0-3) associated with a segment.

The Military Intel386 processor has two main categories of segments: system segments and non-system segments (for code and data). The segment S bit in the segment descriptor determines if a given segment is a system segment or a code or data segment. If the S bit is 1 then the segment is either a code or data segment; if it is 0 then the segment is a system segment.

3.3.4.2 Intel386(TM) Code, Data Descriptors (S = 1)

Figure 4-6 shows the general format of a code and data descriptor and Table 4-1 illustrates how the bits in the access rights byte are interpreted.

Byte address 0  (bits 31...0): Segment base 15...0 | Segment limit 15...0
Byte address +4 (bits 31...0): Base 31...24 | G | D/B | 0 | 0 | Limit 19...16 | Access rights byte | Base 23...16

D/B  1 = default instruction attributes are 32 bits, 0 = default instruction attributes are 16 bits
G    Granularity bit: 1 = segment length is page granular, 0 = segment length is byte granular
0    Bit must be zero (0) for compatibility with future processors

Note: In a maximum-size segment (i.e., a segment with G = 1 and segment limit 19...0 = FFFFFh), the lowest 12 bits of the segment base should be zero (i.e., segment base 11...0 = 000h).

Figure 4-6. Segment Descriptors

Table 4-1. Access Rights Byte Definition for Code and Data Descriptors

Bit       Name                        Function
Position
--------  --------------------------  ----------------------------------------------------------
7         Present (P)                 P = 1  Segment is mapped into physical memory.
                                      P = 0  No mapping to physical memory exists, base and
                                             limit are not used.
6-5       Descriptor Privilege        Segment privilege attribute used in privilege tests.
          Level (DPL)
4         Segment Descriptor (S)      S = 1  Code or data (includes stacks) segment descriptor.
                                      S = 0  System segment descriptor or gate descriptor.

Type field definition, if data segment (S = 1, E = 0):
3         Executable (E)              E = 0   Descriptor type is data segment.
2         Expansion Direction (ED)    ED = 0  Expand up segment, offsets must be ≤ limit.
                                      ED = 1  Expand down segment, offsets must be > limit.
1         Writeable (W)               W = 0   Data segment may not be written into.
                                      W = 1   Data segment may be written into.

Type field definition, if code segment (S = 1, E = 1):
3         Executable (E)              E = 1  Descriptor type is code segment.
2         Conforming (C)              C = 1  Code segment may only be executed when CPL ≥ DPL
                                             and CPL remains unchanged.
1         Readable (R)                R = 0  Code segment may not be read.
                                      R = 1  Code segment may be read.

0         Accessed (A)                A = 0  Segment has not been accessed.
                                      A = 1  Segment selector has been loaded into segment
                                             register or used by selector test instructions.
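The bit assignments in Table 4-1 can be exercised with a small decoder. The sketch below is illustrative only: it follows the bit positions given in the table for code and data descriptors (S = 1) and simply returns the raw 4-bit type for system descriptors (S = 0). The function name and the dictionary keys are assumptions made for the example.

```python
def decode_access_rights(byte: int) -> dict:
    """Decode an access rights byte according to Table 4-1."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("access rights must fit in one byte")
    fields = {
        "present": bool(byte & 0x80),  # bit 7: P
        "dpl": (byte >> 5) & 0x3,      # bits 6-5: DPL
    }
    if not byte & 0x10:                # bit 4: S = 0, system or gate descriptor
        fields["kind"] = "system"
        fields["type"] = byte & 0x0F
        return fields
    fields["accessed"] = bool(byte & 0x01)         # bit 0: A
    if byte & 0x08:                                # bit 3: E = 1, code segment
        fields["kind"] = "code"
        fields["conforming"] = bool(byte & 0x04)   # bit 2: C
        fields["readable"] = bool(byte & 0x02)     # bit 1: R
    else:                                          # bit 3: E = 0, data segment
        fields["kind"] = "data"
        fields["expand_down"] = bool(byte & 0x04)  # bit 2: ED
        fields["writable"] = bool(byte & 0x02)     # bit 1: W
    return fields
```

As a usage example, the value 9Ah decodes as a present, DPL 0, non-conforming, readable code segment that has not yet been accessed.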
Code and data segments have several descriptor fields in common. The accessed A bit is set whenever the processor accesses a descriptor. The A bit is used by operating systems to keep usage statistics on a given segment. The G bit, or granularity bit, specifies if a segment length is byte-granular or page-granular. M80386 segments can be one megabyte long with byte granularity (G = 0) or four gigabytes long with page granularity (G = 1), i.e., 2^20 pages, each page 4 Kbytes in length. The granularity is totally unrelated to paging. A Military Intel386 microprocessor system can consist of segments with byte granularity and page granularity, whether or not paging is enabled.

The executable E bit tells if a segment is a code or data segment. A code segment (E = 1, S = 1) may be execute-only or execute/read as determined by the read R bit. Code segments are execute-only if R = 0, and execute/read if R = 1. Code segments may never be written into.

Note: Code segments may be modified via aliases. Aliases are writeable data segments which occupy the same range of linear address space as the code segment.

The D bit indicates the default length for operands and effective addresses. If D = 1 then 32-bit operands and 32-bit addressing modes are assumed. If D = 0 then 16-bit operands and 16-bit addressing modes are assumed. Therefore all existing 286 code segments will execute on the Military Intel386 processor, assuming the D bit is set to 0.

Another attribute of code segments is determined by the conforming C bit. Conforming segments, C = 1, can be executed and shared by programs at different privilege levels. (See section 3.4 Protection.)

Segments identified as data segments (E = 0, S = 1) are used for two types of Military Intel386 microprocessor segments: stack and data segments. The expansion direction (ED) bit specifies if a segment expands downward (stack) or upward (data). If a segment is a stack segment all offsets must be greater than the segment limit. On a data segment all offsets must be less than or equal to the limit. In other words, stack segments start at the base linear address plus the maximum segment limit and grow down to the base linear address plus the limit. On the other hand, data segments start at the base linear address and expand to the base linear address plus limit.

The write W bit controls the ability to write into a segment. Data segments are read-only if W = 0. The stack segment must have W = 1.

The B bit controls the size of the stack pointer register. If B = 1, then PUSHes, POPs, and CALLs all use the 32-bit ESP register for stack references and assume an upper limit of FFFFFFFFh. If B = 0, stack instructions all use the 16-bit SP register and assume an upper limit of FFFFh.

3.3.4.3 System Descriptor Formats

System segments describe information about operating system tables, tasks, and gates. Figure 4-7 shows the general format of system segment descriptors, and the various types of system segments. Military Intel386 processor system descriptors contain a 32-bit base linear address and a 20-bit segment limit. M80286 system descriptors have a 24-bit base address and a 16-bit segment limit. M80286 system descriptors are identified by the upper 16 bits being all zero.

Byte address 0  (bits 31...0): Segment base 15...0 | Segment limit 15...0
Byte address +4 (bits 31...0): Base 31...24 | G | 0 | 0 | 0 | Limit 19...16 | P | DPL | 0 | Type | Base 23...16

Type  Defines                            Type  Defines
----  ---------------------------------  ----  ---------------------------
0     Invalid                            8     Invalid
1     Available 286 TSS                  9     Available Intel386(TM) TSS
2     LDT                                A     Undefined (Intel reserved)
3     Busy 286 TSS                       B     Busy Intel386 TSS
4     286 call gate                      C     Intel386 call gate
5     Task gate (for 286 or Intel386     D     Undefined (Intel reserved)
      task)
6     286 interrupt gate                 E     Intel386 interrupt gate
7     286 trap gate                      F     Intel386 trap gate

Note: In a maximum-size segment (i.e., a segment with G = 1 and segment limit 19...0 = FFFFFh), the lowest 12 bits of the segment base should be zero (i.e., segment base 11...0 = 000h).

Figure 4-7. System Segments Descriptors
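The base, limit, granularity and expansion-direction rules described for code and data descriptors can be combined into a short worked example. The sketch below unpacks the fields of an 8-byte descriptor using the layout of Figure 4-5 (stored low byte first) and then applies the offset rules for expand-up and expand-down segments. It is a simplification under stated assumptions: only single-byte accesses are checked, and for G = 1 the 20-bit limit is scaled to bytes by shifting left 12 bits and filling the low 12 bits with ones, a detail drawn from general Intel386 architecture documentation, not from the text above. All names are illustrative.

```python
def decode_segment_descriptor(raw: bytes) -> dict:
    """Extract base, limit, G and D/B from an 8-byte segment descriptor."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    limit = int.from_bytes(raw[0:2], "little") | ((raw[6] & 0x0F) << 16)
    base = int.from_bytes(raw[2:4], "little") | (raw[4] << 16) | (raw[7] << 24)
    page_granular = bool(raw[6] & 0x80)  # G bit
    big = bool(raw[6] & 0x40)            # D/B bit
    # Assumption: a page-granular limit counts 4-Kbyte pages.
    byte_limit = (limit << 12) | 0xFFF if page_granular else limit
    return {"base": base, "limit": limit, "page_granular": page_granular,
            "big": big, "byte_limit": byte_limit}


def offset_is_valid(offset: int, byte_limit: int,
                    expand_down: bool, big: bool) -> bool:
    """Check a single-byte access against the segment limit."""
    if expand_down:
        upper = 0xFFFFFFFF if big else 0xFFFF  # upper bound selected by B
        return byte_limit < offset <= upper    # offsets must be > limit
    return 0 <= offset <= byte_limit           # offsets must be <= limit


# A page-granular, 32-bit code segment with base 0 and limit FFFFFh.
flat = decode_segment_descriptor(bytes([0xFF, 0xFF, 0x00, 0x00,
                                        0x00, 0x9A, 0xCF, 0x00]))
```

In this example the 20-bit limit FFFFFh with G = 1 scales to a byte limit of FFFFFFFFh, which corresponds to the four-gigabyte maximum segment size.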
3.3.4.4 LDT Descriptors (S = 0, TYPE = 2)

LDT descriptors (S = 0, TYPE = 2) contain information about Local Descriptor Tables. LDTs contain a table of segment descriptors, unique to a particular task. Since the instruction to load the LDTR is only available at privilege level 0, the DPL field is ignored. LDT descriptors are only allowed in the Global Descriptor Table (GDT).

3.3.4.5 TSS Descriptors (S = 0, TYPE = 1, 3, 9, B)

A Task State Segment (TSS) descriptor contains information about the location, size, and privilege level of a Task State Segment (TSS). A TSS in turn is a special fixed format segment which contains all the state information for a task and a linkage field to permit nesting tasks. The TYPE field is used to indicate whether the task is currently busy (i.e. on a chain of active tasks) or the TSS is available. The TYPE field also indicates if the segment contains a 286 or a Military Intel386 microprocessor TSS. The Task Register (TR) contains the selector which points to the current Task State Segment.

3.3.4.6 Gate Descriptors (S = 0, TYPE = 4-7, C, E, F)

Gates are used to control access to entry points within the target code segment. The various types of gate descriptors are call gates, task gates, interrupt gates, and trap gates. Gates provide a level of indirection between the source and destination of the control transfer. This indirection allows the processor to automatically perform protection checks. It also allows system designers to control entry points to the operating system. Call gates are used to change privilege levels (see section 3.4 Protection), task gates are used to perform a task switch, and interrupt and trap gates are used to specify interrupt service routines.

Figure 4-8 shows the format of the four types of gate descriptors. Call gates are primarily used to transfer program control to a more privileged level. The call gate descriptor consists of three fields: the access byte, a long pointer (selector and offset) which points to the start of a routine, and a word count which specifies how many parameters are to be copied from the caller's stack to the stack of the called routine. The word count field is only used by call gates when there is a change in the privilege level; other types of gates ignore the word count field.

Interrupt and trap gates use the destination selector and destination offset fields of the gate descriptor as a pointer to the start of the interrupt or trap handler routines. The difference between interrupt gates and trap gates is that the interrupt gate disables interrupts (resets the IF bit) while the trap gate does not.

Byte address 0  (bits 31...0): Selector | Offset 15...0
Byte address +4 (bits 31...0): Offset 31...16 | P | DPL | 0 | Type | 0 0 0 | Word count 4...0

Gate Descriptor Fields

Name         Value  Description
-----------  -----  ------------------------------------------------------------
TYPE         4      286 call gate
             5      Task gate (for 286 or Military Intel386(TM) microprocessor task)
             6      286 interrupt gate
             7      286 trap gate
             C      Military Intel386 microprocessor call gate
             E      Military Intel386 microprocessor interrupt gate
             F      Military Intel386 microprocessor trap gate
P            0      Descriptor contents are not valid
             1      Descriptor contents are valid
DPL                 Least privileged level at which a task may access the gate.
WORD COUNT   0-31   The number of parameters to copy from the caller's stack to the
                    called procedure's stack. The parameters are 32-bit quantities
                    for Military Intel386 microprocessor gates, and 16-bit
                    quantities for 286 gates.
DESTINATION  16-bit Selector to the target code segment, or selector to the target
SELECTOR     selector  Task State Segment for a task gate.
DESTINATION  Offset  Entry point within the target code segment:
OFFSET              16-bit for 286, 32-bit for Military Intel386 microprocessor.

Figure 4-8. Gate Descriptor Formats
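The gate layout in Figure 4-8 can be illustrated with a small decoder. The sketch below assumes the descriptor is supplied as 8 bytes stored low byte first, treats types C, E and F as Intel386 gates with a 32-bit offset, and treats the remaining gate types as 286-style with a 16-bit offset. The parameter size per word count entry (4 bytes for Intel386 call gates, 2 bytes for 286 call gates) follows the field description above. Function and key names are illustrative.

```python
GATE_TYPES = (0x4, 0x5, 0x6, 0x7, 0xC, 0xE, 0xF)
INTEL386_GATES = (0xC, 0xE, 0xF)
CALL_GATES = (0x4, 0xC)


def decode_gate(raw: bytes) -> dict:
    """Decode an 8-byte gate descriptor laid out as in Figure 4-8."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    access = raw[5]
    gate_type = access & 0x0F
    if access & 0x10 or gate_type not in GATE_TYPES:
        raise ValueError("not a gate descriptor (S must be 0, type a gate type)")
    is_386 = gate_type in INTEL386_GATES
    offset = int.from_bytes(raw[0:2], "little")
    if is_386:
        offset |= int.from_bytes(raw[6:8], "little") << 16  # offset 31...16
    word_count = raw[4] & 0x1F
    # Only call gates use the word count; other gates ignore it.
    param_bytes = word_count * (4 if is_386 else 2) if gate_type in CALL_GATES else 0
    return {
        "selector": int.from_bytes(raw[2:4], "little"),
        "offset": offset,
        "word_count": word_count,
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0x3,
        "type": gate_type,
        "param_bytes": param_bytes,
    }


# An Intel386 call gate: selector 0008h, offset 00123456h, two parameters, DPL 3.
gate = decode_gate(bytes([0x56, 0x34, 0x08, 0x00, 0x02, 0xEC, 0x12, 0x00]))
```

For this example gate, two 32-bit parameters correspond to 8 bytes copied between stacks when the call changes privilege level.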
Task gates are used to switch tasks. Task gates may only refer to a task state segment (see section 3.4.6 Task Switching); therefore only the destination selector portion of a task gate descriptor is used, and the destination offset is ignored.

Exception 13 is generated when a destination selector does not refer to a correct descriptor type, i.e., a code segment for an interrupt, trap or call gate, or a TSS for a task gate.

The access byte format is the same for all gate descriptors. P = 1 indicates that the gate contents are valid. P = 0 indicates the contents are not valid and causes exception 11 if referenced. DPL is the descriptor privilege level and specifies when this descriptor may be used by a task (see section 3.4 Protection). The S field, bit 4 of the access rights byte, must be 0 to indicate a system control descriptor. The type field specifies the descriptor type as indicated in Figure 4-8.

3.3.4.7 Differences Between Military Intel386(TM) Microprocessor and 286 Descriptors

In order to provide operating system compatibility between the M80286 and the Military Intel386 processor, the Military Intel386 processor supports all of the M80286 segment descriptors. Figure 4-9 shows the general format of an M80286 system segment descriptor. The only differences between 286 and Military Intel386 processor descriptor formats are that the values of the type fields, and the limit and base address fields, have been expanded for the Military Intel386 processor. The M80286 system segment descriptors contained a 24-bit base address and a 16-bit limit, while the Military Intel386 processor system segment descriptors have a 32-bit base address, a 20-bit limit field, and a granularity bit.

By supporting M80286 system segments the Military Intel386 processor is able to execute 286 application programs on a Military Intel386 processor operating system. This is possible because the processor automatically understands which descriptors are 286 descriptors and which descriptors are Military Intel386 processor descriptors. In particular, if the upper word of a descriptor is zero, then that descriptor is a 286-style descriptor.

The only other differences between 286-style descriptors and Military Intel386 processor descriptors are the interpretation of the word count field of call gates and the B bit. The word count field specifies the number of 16-bit quantities to copy for 286 call gates and 32-bit quantities for Military Intel386 processor call gates. The B bit controls the size of PUSHes when using a call gate; if B = 0 PUSHes are 16 bits, if B = 1 PUSHes are 32 bits.

3.3.4.8 Selector Fields

A selector in Protected Mode has three fields: Local or Global Descriptor Table Indicator (TI), Descriptor Entry Index (Index), and Requestor (the selector's) Privilege Level (RPL), as shown in Figure 4-10. The TI bit selects one of two memory-based tables of descriptors (the Global Descriptor Table or the Local Descriptor Table). The Index selects one of 8K descriptors in the appropriate descriptor table. The RPL bits allow high speed testing of the selector's privilege attributes.

3.3.4.9 Segment Descriptor Cache

In addition to the selector value, every segment register has a segment descriptor cache register associated with it. Whenever a segment register's contents are changed, the 8-byte descriptor associated with that selector is automatically loaded (cached) on the chip. Once loaded, all references to that segment use the cached descriptor information instead of reaccessing the descriptor. The contents of the descriptor cache are not visible to the programmer. Since descriptor caches only change when a segment register is changed, programs which modify the descriptor tables must reload the appropriate segment registers after changing a descriptor's value.

Byte address 0  (bits 31...0): Segment base 15...0 | Segment limit 15...0
Byte address +4 (bits 31...0): Intel reserved, set to 0 | P | DPL | S | Type | A | Base 23...16

BASE   Base address of the segment
LIMIT  The length of the segment
P      Present bit: 1 = present, 0 = not present
DPL    Descriptor Privilege Level 0-3
S      System descriptor: 0 = system, 1 = user
TYPE   Type of segment

Figure 4-9. 286 Code and Data Segment Descriptors
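The three selector fields described in section 3.3.4.8 can be separated with simple shifts and masks. In the sketch below the RPL occupies the two least significant bits and the index the upper 13 bits, as stated in the text. The placement of TI at bit 2, with 0 selecting the GDT and 1 selecting the LDT, is an assumption taken from general Intel386 architecture documentation, since Figure 4-10 is not reproduced here. Names are illustrative.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit Protected Mode selector into Index, TI and RPL."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    ti = (selector >> 2) & 0x1  # assumed position of the table indicator
    return {
        "index": selector >> 3,             # upper 13 bits
        "table": "LDT" if ti else "GDT",    # assumed encoding of TI
        "rpl": selector & 0x3,              # two least significant bits
    }
```

For instance, the selector 001Bh decodes to index 3 in the GDT with RPL = 3, and the largest index that fits in 13 bits is 8191, consistent with the 8K descriptors per table mentioned above.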
Figure 4-10. Example Descriptor Selection
3.3.4.10 Segment Descriptor Register Settings

The contents of the segment descriptor cache vary depending on the mode the M80386 is operating in. When operating in Real Address Mode, the segment base, limit, and other attributes within the segment cache registers are defined as shown in Figure 4-11. For compatibility with the M8086 architecture, the base is set to sixteen times the current selector value, the limit is fixed at 0000FFFFh, and the attributes are fixed so as to indicate the segment is present and fully usable. In Real Address Mode, the internal "privilege level" is always fixed to the highest level, level 0, so I/O and other privileged opcodes may be executed.

* Except the 32-bit CS base is initialized to FFFFF000h after reset until first intersegment control transfer (e.g. intersegment CALL, or intersegment JMP, or INT). (See Figure 4-13 example.)

Key:
Y = yes                  N = no
0 = privilege level 0    1 = privilege level 1
2 = privilege level 2    3 = privilege level 3
U = expand up            D = expand down
B = byte granularity     P = page granularity
W = push/pop 16-bit words
F = push/pop 32-bit dwords
- = does not apply to that segment cache register

Figure 4-11. Segment Descriptor Caches for Real Address Mode (Segment Limit and Attributes are Fixed)
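The Real Address Mode cache settings described above reduce to a small calculation: the base is sixteen times the selector and the limit is fixed at FFFFh. The sketch below illustrates only that rule. It does not model the special CS base value after reset noted in the footnote, and the function names are illustrative.

```python
REAL_MODE_LIMIT = 0xFFFF  # fixed segment limit in Real Address Mode


def real_mode_base(selector: int) -> int:
    """Segment base cached for a selector in Real Address Mode."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return selector * 16


def real_mode_linear_address(selector: int, offset: int) -> int:
    """Base plus offset, with the offset checked against the fixed limit."""
    if not 0 <= offset <= REAL_MODE_LIMIT:
        raise ValueError("offset exceeds the fixed FFFFh segment limit")
    return real_mode_base(selector) + offset
```

As a usage example, selector 1234h with offset 0010h gives base 12340h and linear address 12350h.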
When operating in Protected Mode, the segment base, limit, and other attributes within the segment cache registers are defined as shown in Figure 4-12. In Protected Mode, each of these fields is defined according to the contents of the segment descriptor indexed by the selector value loaded into the segment register.

Key:
Y = fixed yes
N = fixed no
d = per segment descriptor
p = per segment descriptor; descriptor must indicate "present" to avoid exception 11 (exception 12 in case of SS)
r = per segment descriptor, but descriptor must indicate "readable" to avoid exception 13 (special case for SS)
w = per segment descriptor, but descriptor must indicate "writable" to avoid exception 13 (special case for SS)
- = does not apply to that segment cache register

Figure 4-12. Segment Descriptor Caches for Protected Mode (Loaded per Descriptor)
When operating in Virtual M8086 Mode within Protected Mode, the segment base, limit, and other attributes within the segment cache registers are defined as shown in Figure 4-13. For compatibility with the M8086 architecture, the base is set to sixteen times the current selector value, the limit is fixed at 0000FFFFh, and the attributes are fixed so as to indicate the segment is present and fully usable. The virtual program executes at the lowest privilege level, level 3, to allow trapping of all IOPL-sensitive instructions and level-0-only instructions.

Key:
Y = yes                  N = no
0 = privilege level 0    1 = privilege level 1
2 = privilege level 2    3 = privilege level 3
U = expand up            D = expand down
B = byte granularity     P = page granularity
W = push/pop 16-bit words
F = push/pop 32-bit dwords
- = does not apply to that segment cache register

Figure 4-13. Segment Descriptor Caches for Virtual M8086 Mode within Protected Mode (Segment Limit and Attributes are Fixed)
3.4 Protection

3.4.1 Protection Concepts

Figure 4-14. Four-Level Hierarchical Protection

The Military Intel386 microprocessor has four levels of protection which are optimized to support the needs of a multi-tasking operating system to isolate and protect user programs from each other and the operating system. The privilege levels control the use of privileged instructions, I/O instructions, and access to segments and segment descriptors. Unlike traditional microprocessor-based systems, where this protection is achieved only through the use of complex external hardware and software, the Military Intel386 processor provides the protection as part of its integrated Memory Management Unit. The Military Intel386 processor offers an additional type of protection on a page basis, when paging is enabled (see section 3.5.3 Page Level Protection).

The four-level hierarchical privilege system is illustrated in Figure 4-14. It is an extension of the user/supervisor privilege mode commonly used by minicomputers and, in fact, the user/supervisor mode is fully supported by the Military Intel386 processor paging mechanism. The privilege levels (PL) are numbered 0 through 3. Level 0 is the most privileged or trusted level.

3.4.2 Rules of Privilege

The Military Intel386 processor controls access to both data and procedures between levels of a task, according to the following rules.

- Data stored in a segment with privilege level p can be accessed only by code executing at a privilege level at least as privileged as p.
- A code segment/procedure with privilege level p can only be called by a task executing at the same or a lesser privilege level than p.

3.4.3 Privilege Levels

3.4.3.1 Task Privilege

At any point in time, a task on the Military Intel386 processor always executes at one of the four privilege levels. The Current Privilege Level (CPL) specifies the task's privilege level. A task's CPL may only be changed by control transfers through gate descriptors to a code segment with a different privilege level (see section 3.4.4 Privilege Level Transfers). Thus, an application program running at PL = 3 may call an operating system routine at PL = 1 (via a gate), which would cause the task's CPL to be set to 1 until the operating system routine was finished.

3.4.3.2 Selector Privilege (RPL)

The privilege level of a selector is specified by the RPL field. The RPL is the two least significant bits of the selector. The selector's RPL is only used to establish a less trusted privilege level than the current privilege level for the use of a segment. This level is called the task's effective privilege level (EPL). The EPL is defined as being the least privileged (i.e. numerically larger) level of a task's CPL and a selector's RPL. Thus, if a selector's RPL = 0 then the CPL always specifies the privilege level for making an access using the selector. On the other hand, if RPL = 3 then a selector can only access segments at level 3 regardless of the task's CPL. The RPL is most commonly used to verify that pointers passed to an operating system procedure do not access data that is of higher privilege than the procedure that originated the pointer. Since the originator of a selector can specify any RPL value, the Adjust RPL (ARPL) instruction is provided to force the RPL bits to the originator's CPL.

3.4.3.3 I/O Privilege and I/O Permission Bitmap

The I/O privilege level (IOPL, a 2-bit field in the EFLAGS register) defines the least privileged level at which I/O instructions can be unconditionally performed. I/O instructions can be unconditionally performed when CPL ≤ IOPL. (The I/O instructions are IN, OUT, INS, OUTS, REP INS, and REP OUTS.) When CPL > IOPL, and the current task is associated with a 286 TSS, attempted I/O instructions cause an exception 13 fault.
when cpl l iopl, and the current task is associated with a military intel386 processor tss, the i/o permission bitmap (part of a military intel386 processor tss) is consulted on whether i/o to the port is allowed, or an exception 13 fault is to be generated instead. for diagrams of the i/o permission bitmap, refer to figures 4-15a and 4-15b. for further information on how the i/o 47
military intel386 tm microprocessor permission bitmap is used in protected mode or in virtual m8086 mode, refer to section 3.6.4 protec- tion and i/o permission bitmap. the i/o privilege level (iopl) also affects whether several other instructions can be executed or cause an exception 13 fault instead. these instructions are called ``iopl-sensitive'' instructions and they are cli and sti. (note that the lock prefix is not iopl- sensitive on the military intel386 processor.) the iopl also affects whether the if (interrupts en- able flag) bit can be changed by loading a value into the eflags register. when cpl s iopl, then the if bit can be changed by loading a new value into the eflags register. when cpl l iopl, the if bit cannot be changed by a new value pop'ed into (or otherwise loaded into) the eflags register; the if bit merely remains unchanged and no exception is generated. table 4-2. pointer test instructions instruction operands function arpl selector, adjust requested privi- register lege level: adjusts the rpl of the selector to the numeric maximum of current selector rpl value and the rpl value in the register. set zero flag if selector rpl was changed. verr selector verify for read: sets the zero flag if the segment referred to by the selector can be read. verw selector verify for write: sets the zero flag if the segment referred to by the selector can be written. lsl register, load segment limit: reads selector the segment limit into the register if privilege rules and descriptor type allow. set zero flag if successful. lar register, load access rights: reads selector the descriptor access rights byte into the register if privilege rules allow. set zero flag if successful. 3.4.3.4 privilege validation the military intel386 microprocessor provides sever- al instructions to speed pointer testing and help maintain system integrity by verifying that the selec- tor value refers to an appropriate segment. 
table 4-2 summarizes the selector validation procedures available for the military intel386 microprocessor.

this pointer verification prevents the common problem of an application at pl = 3 calling an operating system routine at pl = 0 and passing the operating system routine a ``bad'' pointer which corrupts a data structure belonging to the operating system. if the operating system routine uses the arpl instruction to ensure that the rpl of the selector has no greater privilege than that of the caller, then this problem can be avoided.

3.4.3.5 descriptor access

there are basically two types of segment accesses: those involving code segments, such as control transfers, and those involving data accesses. determining the ability of a task to access a segment involves the type of segment to be accessed, the instruction used, the type of descriptor used and cpl, rpl, and dpl as described above.

any time an instruction loads data segment registers (ds, es, fs, gs) the military intel386 processor makes protection validation checks. selectors loaded in the ds, es, fs, gs registers must refer only to data segments or readable code segments. the data access rules are specified in section 3.2.2 rules of privilege. the only exception to those rules is readable conforming code segments, which can be accessed at any privilege level.

finally the privilege validation checks are performed. the dpl is compared to the epl and if the dpl is more privileged than the epl an exception 13 (general protection fault) is generated.

the rules regarding the stack segment are slightly different from those involving data segments. instructions that load selectors into ss must refer to data segment descriptors for writeable data segments. the dpl and rpl must equal the cpl. all other descriptor types or a privilege level violation will cause exception 13. a stack not present fault causes exception 12. note that an exception 11 is used for a not-present code or data segment.
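the arpl adjustment in table 4-2 and the data segment load check can be sketched as follows. this is a minimal illustration, not the processor's microcode: the function names are hypothetical, the rpl is taken to be the low two bits of a selector, the epl is taken to be the numeric maximum of cpl and rpl, and the load is assumed to fault when the descriptor's dpl is more privileged (numerically smaller) than the epl, which is the same data access rule section 3.4.5 applies to gate descriptors. the conforming code segment exception is not modeled.

```python
def rpl(selector: int) -> int:
    """Requested privilege level: the low two bits of a selector (assumed layout)."""
    return selector & 0b11


def arpl(selector: int, register: int) -> tuple[int, bool]:
    """Raise the selector's rpl to the numeric maximum of both rpl values.

    Returns the adjusted selector and the zero flag (True if the rpl changed).
    """
    new_rpl = max(rpl(selector), rpl(register))
    adjusted = (selector & 0xFFFC) | new_rpl
    return adjusted, adjusted != selector


def data_load_faults(cpl: int, selector: int, dpl: int) -> bool:
    """True if loading the selector into ds/es/fs/gs would raise exception 13.

    Assumption: epl = max(cpl, rpl); the load faults when the descriptor's
    dpl is more privileged (numerically smaller) than the epl.
    """
    epl = max(cpl, rpl(selector))
    return dpl < epl
```

for example, an operating system routine that receives selector 0010h from a level 3 caller whose cs is 0023h would obtain 0013h from arpl, and a later load of that selector against a dpl = 0 data segment would fault instead of corrupting operating system data.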
3.4.4 privilege level transfers

inter-segment control transfers occur when a selector is loaded in the cs register. for a typical system most of these transfers are simply the result of a call or a jump to another routine. there are five types of control transfers, which are summarized in table 4-3. many of these transfers result in a privilege level transfer. changing privilege levels is done only via control transfers, by using gates, task switches, and interrupt or trap gates.

table 4-3. descriptor types used for control transfer

control transfer types               operation types              descriptor referenced   descriptor table
intersegment within the same         jmp, call, ret, iret*        code segment            gdt/ldt
  privilege level
intersegment to the same or higher   call                         call gate               gdt/ldt
  privilege level; interrupt         interrupt instruction,       trap or interrupt gate  idt
  within task may change cpl           exception, external
                                       interrupt
intersegment to a lower privilege    ret, iret*                   code segment            gdt/ldt
  level (changes task cpl)
task switch                          call, jmp                    task state segment      gdt
                                     call, jmp                    task gate               gdt/ldt
                                     iret**, interrupt            task gate               idt
                                       instruction, exception,
                                       external interrupt

*  nt (nested task bit of flag register) = 0
** nt (nested task bit of flag register) = 1

control transfers can only occur if the operation which loaded the selector references the correct descriptor type. any violation of these descriptor usage rules will cause an exception 13 (e.g. jmp through a call gate, or iret from a normal subroutine call).

in order to provide further system security, all control transfers are also subject to the privilege rules. the privilege rules require that:

- privilege level transitions can only occur via gates.
- jmps can be made to a non-conforming code segment with the same privilege or to a conforming code segment with greater or equal privilege.
- calls can be made to a non-conforming code segment with the same privilege or via a gate to a more privileged level.
- interrupts handled within the task obey the same privilege rules as calls.
- conforming code segments are accessible by privilege levels which are the same or less privileged than the conforming-code segment's dpl.
- both the requested privilege level (rpl) in the selector pointing to the gate and the task's cpl must be of equal or greater privilege than the gate's dpl.
- the code segment selected in the gate must be the same or more privileged than the task's cpl.
- return instructions that do not switch tasks can only return control to a code segment with same or less privilege.
- task switches can be performed by a call, jmp, or int which references either a task gate or task state segment whose dpl is less privileged or the same privilege as the old task's cpl.

any control transfer that changes cpl within a task causes a change of stacks as a result of the privilege level change. the initial values of ss:esp for privilege levels 0, 1, and 2 are retained in the task state segment (see section 3.4.6 task switching). during a jmp or call control transfer, the new stack pointer is loaded into the ss and esp registers and the previous stack pointer is pushed onto the new stack.

when returning to the original privilege level, use of the lower-privileged stack is restored as part of the ret or iret instruction operation. for subroutine calls that pass parameters on the stack and cross privilege levels, a fixed number of words (as specified in the gate's word count field) are copied from the previous stack to the current stack. the inter-segment ret instruction with a stack adjustment value will correctly restore the previous stack pointer upon return.
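the two gate-related privilege rules in the list above can be sketched as a small check. this is only an illustration under stated assumptions: the function names are hypothetical, privilege levels are the integers 0 (most privileged) to 3 (least privileged), and the target is assumed to be a non-conforming code segment so that the call changes cpl whenever the target is more privileged.

```python
def gate_call_allowed(cpl: int, rpl: int, gate_dpl: int, target_dpl: int) -> bool:
    """Apply the two gate rules from the privilege rule list.

    Numerically smaller values are more privileged.
    """
    # both the rpl of the gate selector and the task's cpl must be of equal
    # or greater privilege than the gate's dpl
    gate_accessible = max(cpl, rpl) <= gate_dpl
    # the code segment selected in the gate must be the same or more
    # privileged than the task's cpl
    target_ok = target_dpl <= cpl
    return gate_accessible and target_ok


def call_switches_stack(cpl: int, target_dpl: int) -> bool:
    """A transfer that changes cpl within a task also changes stacks.

    Assumption: non-conforming target, so the new cpl equals target_dpl.
    """
    return target_dpl < cpl
```

for instance, a level 3 application can enter a level 0 routine through a gate with dpl = 3, and the call switches to the level 0 stack taken from the tss; the same gate with dpl = 2 would be inaccessible to that application.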
figure 4-15a. military intel386 tm microprocessor tss and tss registers (type = 9: available military intel386 microprocessor tss; type = b: busy military intel386 microprocessor tss)
figure 4-15b. sample i/o permission bit map (i/o ports accessible: 2-9, 12, 13, 15, 20-24, 27, 33, 34, 40, 41, 48, 50, 52, 53, 58-60, 62, 63, 96-127)

3.4.5 call gates

gates provide protected, indirect calls. one of the major uses of gates is to provide a secure method of privilege transfers within a task. since the operating system defines all of the gates in a system, it can ensure that all gates only allow entry into a few trusted procedures (such as those which allocate memory, or perform i/o).

gate descriptors follow the data access rules of privilege; that is, gates can be accessed by a task if the epl is equal to or more privileged than the gate descriptor's dpl. gates follow the control transfer rules of privilege and therefore may only transfer control to a more privileged level.

call gates are accessed via a call instruction and are syntactically identical to calling a normal subroutine. when an inter-level military intel386 processor call gate is activated, the following actions occur.

1. load cs:eip from the gate and check for validity
2. ss is pushed zero-extended to 32 bits
3. esp is pushed
4. copy word count 32-bit parameters from the old stack to the new stack
5. push return address on stack

the procedure is identical for 286 call gates, except that 16-bit parameters are copied and 16-bit registers are pushed.

interrupt gates and trap gates work in a similar fashion to call gates, except that there is no copying of parameters. the only difference between trap and interrupt gates is that control transfers through an interrupt gate disable further interrupts (i.e. the if bit is set to 0), and trap gates leave the interrupt status unchanged.

3.4.6 task switching

a very important attribute of any multi-tasking/multi-user operating system is its ability to rapidly switch between tasks or processes.
the military intel386 processor directly supports this operation by providing a task switch instruction in hardware. the military intel386 processor task switch operation saves the entire state of the machine (all of the registers, address space, and a link to the previous task), loads a new execution state, performs protection checks, and commences execution in the new task, in about 17 microseconds. like transfer of control via gates, the task switch operation is invoked by executing an inter-segment jmp or call instruction which refers to a task state segment (tss), or a task gate descriptor in the gdt or ldt. an int n instruction, exception, trap, or external interrupt may also invoke the task switch operation if there is a task gate descriptor in the associated idt descriptor slot.

the tss descriptor points to a segment (see figure 4-15a) containing the entire military intel386 processor execution state, while a task gate descriptor contains a tss selector. the military intel386 processor supports both 286 and military intel386 processor style tsss. figure 4-16 shows a 286 tss. the limit of an intel386 tss must be greater than 0064h (002bh for a 286 tss), and can be as large as 4 gigabytes. in the additional tss space, the operating system is free to store additional information such as the reason the task is inactive, the time the task has spent running, and the open files belonging to the task.

each task must have a tss associated with it. the current tss is identified by a special register in the military intel386 processor called the task state segment register (tr). this register contains a selector referring to the task state segment descriptor that defines the current tss. a hidden base and limit register associated with tr are loaded whenever tr is loaded with a new selector. returning from a task is accomplished by the iret instruction. when iret is executed, control is returned to the task which was interrupted.
the currently executing task's state is saved in the tss and the old task state is restored from its tss.

figure 4-16. 286 tss

several bits in the flag register and machine status word (cr0) give information about the state of a task which is useful to the operating system. the nested task (nt) bit (bit 14 in eflags) controls the function of the iret instruction. if nt = 0, the iret instruction performs the regular return; when nt = 1, iret performs a task switch operation back to the previous task. the nt bit is set or reset in the following fashion:

when a call or int instruction initiates a task switch, the new tss will be marked busy and the back link field of the new tss set to the old tss selector. the nt bit of the new task is set by call or int initiated task switches. an interrupt that does not cause a task switch will clear nt. (the nt bit will be restored after execution of the interrupt handler.) nt may also be set or cleared by popf or iret instructions.

the military intel386 processor task state segment is marked busy by changing the descriptor type field from type 9h to type bh. a 286 tss is marked busy by changing the descriptor type field from type 1 to type 3. use of a selector that references a busy task state segment causes an exception 13.

the virtual mode (vm) bit 17 is used to indicate if a task is a virtual m8086 task. if vm = 1, then the task will use the real mode addressing mechanism. the virtual m8086 environment is only entered and exited via a task switch (see section 3.6 virtual mode).

the coprocessor's state is not automatically saved when a task switch occurs, because the incoming task may not use the coprocessor. the task switched (ts) bit (bit 3 in cr0) helps deal with the coprocessor's state in a multi-tasking environment. whenever the military intel386 processor switches tasks, it sets the ts bit. the military intel386 processor detects the first use of a processor extension instruction after a task switch and causes the processor extension not available exception 7. the exception handler for exception 7 may then decide whether to save the state of the coprocessor. a processor extension not present exception (7) will occur when attempting to execute an esc or wait instruction if the task switched and monitor coprocessor extension bits are both set (i.e. ts = 1 and mp = 1).
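the bit-level bookkeeping described above can be sketched as follows. this is an illustrative model only: the function names are hypothetical, the bit positions of nt and ts are those given in the text, and the position of mp (bit 1 of cr0) is an assumption because this section does not state it.

```python
NT_BIT = 1 << 14   # nested task flag, bit 14 of eflags
TS_BIT = 1 << 3    # task switched flag, bit 3 of cr0
MP_BIT = 1 << 1    # monitor coprocessor flag; bit position assumed, not given here

AVAILABLE_TSS_TYPES = {0x1, 0x9}   # available 286 tss, available intel386 tss


def mark_tss_busy(descriptor_type: int) -> int:
    """Return the busy form of a tss descriptor type (1 -> 3, 9h -> bh)."""
    if descriptor_type not in AVAILABLE_TSS_TYPES:
        # a busy tss (type 3 or bh) cannot be entered again
        raise ValueError("exception 13: not an available task state segment")
    return descriptor_type | 0x2


def iret_action(eflags: int) -> str:
    """nt = 0: regular return; nt = 1: task switch back to the previous task."""
    return "task switch" if eflags & NT_BIT else "regular return"


def wait_traps(cr0: int) -> bool:
    """A wait instruction raises exception 7 when ts = 1 and mp = 1."""
    return bool(cr0 & TS_BIT) and bool(cr0 & MP_BIT)
```

the busy transition amounts to setting bit 1 of the type field, which is why both tss styles share one rule.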
the t bit in the military intel386 processor tss indicates that the processor should generate a debug exception when switching to a task. if t = 1, then upon entry to a new task a debug exception 1 will be generated.

3.4.7 initialization and transition to protected mode

since the military intel386 processor begins executing in real mode immediately after reset, it is necessary to initialize the system tables and registers with the appropriate values.

the gdt and idt registers must refer to a valid gdt and idt. the idt should be at least 256 bytes long, and the gdt must contain descriptors for the initial code and data segments. figure 4-17 shows the tables and figure 4-18 the descriptors needed for a simple protected mode military intel386 processor system. it has a single code and single data/stack segment, each four gigabytes long, and a single privilege level pl = 0.

the actual method of enabling protected mode is to load cr0 with the pe bit set, via the mov cr0, r/m instruction. this puts the military intel386 processor in protected mode.

after enabling protected mode, the next instruction should execute an intersegment jmp to load the cs register and flush the instruction decode queue. the final step is to load all of the data segment registers with the initial selector values.

an alternate approach to entering protected mode, which is especially appropriate for multi-tasking operating systems, is to use the built-in task switch to load all of the registers. in this case the gdt would contain two tss descriptors in addition to the code and data descriptors needed for the first task. the first jmp instruction in protected mode would jump to the tss, causing a task switch and loading all of the registers with the values stored in the tss. the task state segment register should be initialized to point to a valid tss descriptor, since a task switch saves the state of the current task in a task state segment.
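as an illustration of the gdt contents this initialization needs, the sketch below packs a segment descriptor into its two 32-bit halves. it is a hedged example rather than a listing from the datasheet: the function name and argument names are hypothetical, the field placement follows the standard 8-byte segment descriptor layout, and the sample values are the ones read from figure 4-18.

```python
def encode_descriptor(base, limit, access, granular=True, default_32=True):
    """Pack a segment descriptor into (low dword, high dword).

    low dword : base 15...0 in the upper half, limit 15...0 in the lower half
    high dword: base 31...24, g, d, limit 19...16, access rights byte, base 23...16
    """
    low = ((base & 0xFFFF) << 16) | (limit & 0xFFFF)
    flags = (int(granular) << 3) | (int(default_32) << 2)   # g, d, 0, 0
    high = ((base & 0xFF000000)
            | (flags << 20)
            | (limit & 0xF0000)
            | ((access & 0xFF) << 8)
            | ((base >> 16) & 0xFF))
    return low, high


# values as read from figure 4-18: base 15...0 = 0118h, limit = fffffh, g = d = 1
DATA_DESCRIPTOR = encode_descriptor(0x0118, 0xFFFFF, 0b10010010)
CODE_DESCRIPTOR = encode_descriptor(0x0118, 0xFFFFF, 0b10011010)
NULL_DESCRIPTOR = (0, 0)
```

with g = 1 the 20-bit limit fffffh is counted in 4 kbyte units, which is how each segment reaches four gigabytes.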
figure 4-17. simple protected system

figure 4-18. gdt descriptors for simple system

entry  descriptor               base 31...24  g  d  limit 19...16  access rights byte  base 23...16  base 15...0  limit 15...0
2      data segment descriptor  00 (h)        1  1  f (h)          1 0 0 1 0 0 1 0     00 (h)        0118 (h)     ffff (h)
1      code segment descriptor  00 (h)        1  1  f (h)          1 0 0 1 1 0 1 0     00 (h)        0118 (h)     ffff (h)
0      null descriptor          0

3.4.8 tools for building protected systems

in order to simplify the design of a protected multi-tasking system, intel provides a tool which allows the system designer an easy method of constructing the data structures needed for a protected mode military intel386 processor system. this tool is the builder bld-386. bld-386 lets the operating system writer specify all of the segment descriptors discussed in the previous sections (ldts, idts, gdts, gates, and tsss) in a high-level language.

3.5 paging

3.5.1 paging concepts

paging is another type of memory management useful for virtual memory multitasking operating systems. unlike segmentation, which modularizes programs and data into variable length segments, paging divides programs into multiple uniform size pages. pages bear no direct relation to the logical structure of a program. while segment selectors can be considered the logical ``name'' of a program module or data structure, a page most likely corresponds to only a portion of a module or data structure.

by taking advantage of the locality of reference displayed by most programs, only a small number of pages from each active task need be in memory at any one moment.

3.5.2 paging organization

3.5.2.1 page mechanism

the military intel386 processor uses two levels of tables to translate the linear address (from the segmentation unit) into a physical address. there are three components to the paging mechanism of the military intel386 processor: the page directory, the page tables, and the page itself (page frame). all memory-resident elements of the military intel386 processor paging mechanism are the same size, namely, 4 kbytes. a uniform size for all of the elements simplifies memory allocation and reallocation schemes, since there is no problem with memory fragmentation. figure 4-19 shows how the paging mechanism works.

3.5.2.2 page descriptor base register

cr2 is the page fault linear address register. it holds the 32-bit linear address which caused the last page fault detected.

cr3 is the page directory physical base address register. it contains the physical starting address of the page directory. the lower 12 bits of cr3 are always zero to ensure that the page directory is always page aligned. loading it via a mov cr3, reg instruction causes the page table entry cache to be flushed, as will a task switch through a tss which changes the value of cr3. (see 3.5.4 translation lookaside buffer.)

3.5.2.3 page directory

the page directory is 4 kbytes long and allows up to 1024 page directory entries.
each page directory entry contains the address of the next level of tables, the page tables, and information about the page table. the contents of a page directory entry are shown in figure 4-20. the upper 10 bits of the linear address (a22-a31) are used as an index to select the correct page directory entry.

figure 4-19. paging mechanism

figure 4-20. page directory entry (points to page table)

bits 31..12  page table address 31..12
bits 11..9   os reserved
bits 8..7    0 0
bit 6        d
bit 5        a
bits 4..3    0 0
bit 2        u/s
bit 1        r/w
bit 0        p

figure 4-21. page table entry (points to page)

bits 31..12  page frame address 31..12
bits 11..9   os reserved
bits 8..7    0 0
bit 6        d
bit 5        a
bits 4..3    0 0
bit 2        u/s
bit 1        r/w
bit 0        p

3.5.2.4 page tables

each page table is 4 kbytes and holds up to 1024 page table entries. page table entries contain the starting address of the page frame and statistical information about the page (see figure 4-21). address bits a12-a21 are used as an index to select one of the 1024 page table entries. the upper 20-bit page frame address is concatenated with the lower 12 bits of the linear address to form the physical address. page tables can be shared between tasks and swapped to disks.

3.5.2.5 page directory/table entries

the lower 12 bits of the page table entries and page directory entries contain statistical information about pages and page tables respectively. the p (present) bit 0 indicates if a page directory or page table entry can be used in address translation. if p = 1 the entry can be used for address translation; if p = 0 the entry cannot be used for translation, and all of the other bits are available for use by the software. for example the remaining 31 bits could be used to indicate where on the disk the page is stored.

the a (accessed) bit 5 is set by the m80386 for both types of entries before a read or write access occurs to an address covered by the entry. the d (dirty) bit 6 is set to 1 before a write to an address covered by that page table entry occurs. the d bit is undefined for page directory entries. when the p, a and d bits are updated by the military intel386 processor, it generates a read-modify-write cycle which locks the bus and prevents conflicts with other processors or peripherals. software which modifies these bits should use the lock prefix to ensure the integrity of the page tables in multi-master systems.

the 3 bits marked os reserved in figure 4-20 and figure 4-21 (bits 9-11) are software definable.
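the field boundaries given in sections 3.5.2.3 through 3.5.2.5 can be written out as a short sketch. the function and key names are hypothetical; the bit positions are the ones stated in the text and in figures 4-20 and 4-21.

```python
def split_linear(linear):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    directory_index = (linear >> 22) & 0x3FF   # a22-a31
    table_index = (linear >> 12) & 0x3FF       # a12-a21
    offset = linear & 0xFFF                    # lower 12 bits
    return directory_index, table_index, offset


def decode_entry(entry):
    """Decode a page directory or page table entry (d is undefined for directories)."""
    return {
        "address": entry & 0xFFFFF000,       # bits 31..12
        "os_reserved": (entry >> 9) & 0x7,   # bits 11..9, software definable
        "d": bool(entry & 0x40),             # bit 6, dirty
        "a": bool(entry & 0x20),             # bit 5, accessed
        "us": bool(entry & 0x04),            # bit 2, user/supervisor
        "rw": bool(entry & 0x02),            # bit 1, read/write
        "p": bool(entry & 0x01),             # bit 0, present
    }


def physical_address(page_table_entry, linear):
    """Concatenate the 20-bit page frame address with the 12-bit offset."""
    return (page_table_entry & 0xFFFFF000) | (linear & 0xFFF)
```

for example, linear address 00403025h selects directory entry 1, table entry 3 and offset 025h; with a page table entry of 0009a067h the physical address is 0009a025h.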
operating systems are free to use these bits for whatever purpose they wish. an example use of the os reserved bits would be to store information about page aging. by keeping track of how long a page has been in memory since being accessed, an operating system can implement a page replacement algorithm like least recently used.

the (user/supervisor) u/s bit 2 and the (read/write) r/w bit 1 are used to provide protection attributes for individual pages.

3.5.3 page level protection (r/w, u/s bits)

the military intel386 processor provides a set of protection attributes for paging systems. the paging mechanism distinguishes between two levels of protection: user, which corresponds to level 3 of the segmentation based protection, and supervisor, which encompasses all of the other protection levels (0, 1, 2). programs executing at level 0, 1 or 2 bypass the page protection, although segmentation based protection is still enforced by the hardware.

the u/s and r/w bits are used to provide user/supervisor and read/write protection for individual pages or for all pages covered by a page directory entry. the u/s and r/w bits in the first level page directory table apply to all pages described by the page table pointed to by that directory entry. the u/s and r/w bits in the second level page table entry apply only to the page described by that entry. the u/s and r/w bits for a given page are obtained by taking the most restrictive of the u/s and r/w from the page directory table entries and the page table entries and using these bits to address the page.

example: if the u/s and r/w bits for the page directory entry were 10 and the u/s and r/w bits for the page table entry were 01, the access rights for the page would be 01, the numerically smaller of the two.

table 4-4 shows the effect of the u/s and r/w bits on accessing memory.

table 4-4. protection provided by r/w and u/s

u/s  r/w  permitted access, level 3  permitted access, levels 0, 1, or 2
0    0    none                       read/write
0    1    none                       read/write
1    0    read-only                  read/write
1    1    read/write                 read/write

however, a given segment can be easily made read-only for level 0, 1, or 2 via the use of segmented protection mechanisms (section 3.4 protection).
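table 4-4 can be restated as a small lookup, shown here as a sketch with a hypothetical function name. the arguments are the already-combined u/s and r/w bits for the page and the privilege level of the executing program.

```python
def permitted_access(us: int, rw: int, level: int) -> str:
    """Return the access that table 4-4 permits for a page."""
    if level in (0, 1, 2):
        # supervisor levels bypass page protection
        return "read/write"
    if us == 0:
        # supervisor page: no access from level 3
        return "none"
    return "read/write" if rw else "read-only"
```

a write from level 3 to a page with u/s = 1 and r/w = 0 is therefore a protection violation, while the same write from level 0 is allowed by the paging unit.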
3.5.4 translation lookaside buffer

the military intel386 processor paging hardware is designed to support demand paged virtual memory systems. however, performance would degrade substantially if the processor was required to access two levels of tables for every memory reference. to solve this problem, the military intel386 processor keeps a cache of the most recently accessed pages; this cache is called the translation lookaside buffer (tlb). the tlb is a four-way set associative 32-entry page table cache. it automatically keeps the most commonly used page table entries in the processor. the 32-entry tlb coupled with a 4k page size results in coverage of 128 kbytes of memory addresses. for many common multi-tasking systems, the tlb will have a hit rate of about 98%. this means that the processor will only have to access the two-level page structure on 2% of all memory references. figure 4-22 illustrates how the tlb complements the military intel386 processor's paging mechanism.

figure 4-22. translation lookaside buffer

3.5.5 paging operation

the paging hardware operates in the following fashion. the paging unit hardware receives a 32-bit linear address from the segmentation unit. the upper 20 linear address bits are compared with all 32 entries in the tlb to determine if there is a match. if there is a match (i.e. a tlb hit), then the 32-bit physical address is calculated and will be placed on the address bus.

however, if the page table entry is not in the tlb, the military intel386 microprocessor will read the appropriate page directory entry. if p = 1 on the page directory entry, indicating that the page table is in memory, then the military intel386 processor will read the appropriate page table entry and set the access bit. if p = 1 on the page table entry, indicating that the page is in memory, the military intel386 processor will update the access and dirty bits as needed and fetch the operand.
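the lookup sequence just described can be modeled in a few lines. this is a simplified sketch under several assumptions, all of them mine rather than the datasheet's: physical memory is a dictionary from entry address to 32-bit value, each table entry occupies 4 bytes (4 kbytes divided by 1024 entries), the tlb is an unbounded dictionary rather than the 32-entry four-way cache, protection checks are omitted, and a not-present entry is reported by returning none instead of raising a fault.

```python
P, A, D = 0x01, 0x20, 0x40   # present, accessed, dirty bits


def translate(memory, cr3, linear, is_write=False, tlb=None):
    """Translate a linear address; return the physical address or None."""
    if tlb is None:
        tlb = {}
    page_number = linear >> 12            # upper 20 linear address bits
    offset = linear & 0xFFF
    if page_number in tlb:                # tlb hit
        return tlb[page_number] | offset

    # tlb miss: read the page directory entry
    pde_address = (cr3 & 0xFFFFF000) + 4 * (linear >> 22)
    pde = memory.get(pde_address, 0)
    if not pde & P:
        return None                       # page table not in memory
    memory[pde_address] = pde | A         # set the access bit

    # read the page table entry
    pte_address = (pde & 0xFFFFF000) + 4 * ((linear >> 12) & 0x3FF)
    pte = memory.get(pte_address, 0)
    if not pte & P:
        return None                       # page not in memory
    pte |= A | (D if is_write else 0)     # update access and dirty bits
    memory[pte_address] = pte

    tlb[page_number] = pte & 0xFFFFF000   # remember the translation
    return tlb[page_number] | offset
```

a second reference to the same page is then served from the dictionary standing in for the tlb, without touching either table.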
the upper 20 bits of the linear address, read from the page table, will be stored in the tlb for future accesses. however, if p = 0 for either the page directory entry or the page table entry, then the processor will generate a page fault, an exception 14.

the processor will also generate an exception 14, page fault, if the memory reference violated the page protection attributes (i.e. u/s or r/w) (e.g. trying to write to a read-only page). cr2 will hold the linear address which caused the page fault. if a second page fault occurs while the processor is attempting to enter the service routine for the first, then the processor will invoke the page fault (exception 14) handler a second time, rather than the double fault (exception 8) handler. since exception 14 is classified as a fault, cs:eip will point to the instruction causing the page fault. the 16-bit error code pushed onto the stack for the page fault handler will contain status bits which indicate the cause of the page fault.

the 16-bit error code is used by the operating system to determine how to handle the page fault. figure 4-23a shows the format of the page-fault error code and the interpretation of the bits.

note: even though the bits in the error code (u/s, w/r, and p) have similar names to the bits in the page directory/table entries, the interpretation of the error code bits is different. figure 4-23b indicates what type of access caused the page fault.

figure 4-23a. page fault error code format

bits 15..3  u (undefined)
bit 2       u/s
bit 1       w/r
bit 0       p

u/s: the u/s bit indicates whether the access causing the fault occurred when the processor was executing in user mode (u/s = 1) or in supervisor mode (u/s = 0)
w/r: the w/r bit indicates whether the access causing the fault was a read (w/r = 0) or a write (w/r = 1)
p: the p bit indicates whether a page fault was caused by a not-present page (p = 0), or by a page level protection violation (p = 1)
u: undefined
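a page fault handler would typically begin by decoding this error code. the sketch below follows the bit assignments of figure 4-23a; the function name and the returned strings are hypothetical.

```python
def decode_page_fault(error_code: int) -> dict:
    """Interpret the low three bits of the page fault error code."""
    return {
        # bit 2: mode the processor was executing in
        "mode": "user" if error_code & 0x4 else "supervisor",
        # bit 1: type of access that faulted
        "access": "write" if error_code & 0x2 else "read",
        # bit 0: not-present page versus page level protection violation
        "cause": "protection violation" if error_code & 0x1 else "not-present page",
    }
```

bits 15..3 are undefined, so the sketch masks only the low three bits and ignores the rest.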
figure 4-23b. type of access causing page fault

u/s  w/r  access type
0    0    supervisor* read
0    1    supervisor write
1    0    user read
1    1    user write

* descriptor table access will fault with u/s = 0, even if the program is executing at level 3.

3.5.6 operating system responsibilities

the military intel386 processor takes care of the page address translation process, relieving the burden from an operating system in a demand-paged system. the operating system is responsible for setting up the initial page tables, and handling any page faults. the operating system also is required to invalidate (i.e. flush) the tlb when any changes are made to any of the page table entries. the operating system must reload cr3 to cause the tlb to be flushed.

setting up the tables is simply a matter of loading cr3 with the address of the page directory, and allocating space for the page directory and the page tables. the primary responsibility of the operating system is to implement a swapping policy and handle all of the page faults.

a final concern of the operating system is to ensure that the tlb cache matches the information in the paging tables. in particular, any time the operating system sets the p (present) bit of a page table entry to zero, the tlb must be flushed. operating systems may want to take advantage of the fact that cr3 is stored as part of a tss, to give every task or group of tasks its own set of page tables.

3.6 virtual m8086 environment

3.6.1 executing m8086 programs

the military intel386 processor allows the execution of m8086 application programs in both real mode and in the virtual m8086 mode (virtual mode). of the two methods, virtual m8086 mode offers the system designer the most flexibility. the virtual m8086 mode allows the execution of m8086 applications, while still allowing the system designer to take full advantage of the military intel386 processor protection mechanism.
in particular, the military intel386 processor allows the simultaneous execution of an m8086 operating system and its applications, and a military intel386 processor operating system and both m80286 and military intel386 processor applications. thus, in a multi-user military intel386 processor computer, one person could be running an ms-dos spreadsheet, another person using ms-dos, and a third person could be running multiple unix utilities and applications. each person in this scenario would believe that he had the computer completely to himself. figure 4-24 illustrates this concept.

3.6.2 virtual m8086 mode addressing mechanism

one of the major differences between military intel386 processor real and protected modes is how the segment selectors are interpreted. when the processor is executing in virtual m8086 mode the segment registers are used in an identical fashion to real mode. the contents of the segment register are shifted left 4 bits and added to the offset to form the segment base linear address.

the military intel386 processor allows the operating system to specify which programs use the m8086 style address mechanism, and which programs use protected mode addressing, on a per task basis. through the use of paging, the one megabyte address space of the virtual mode task can be mapped to anywhere in the 4 gigabyte linear address space of the military intel386 processor. like real mode, virtual mode effective addresses (i.e., segment offsets) that exceed 64 kbyte will cause an exception 13. however, these restrictions should not prove to be important, because most tasks running in virtual m8086 mode will simply be existing m8086 application programs.

3.6.3 paging in virtual mode

the paging hardware allows the concurrent running of multiple virtual mode tasks, and provides protection and operating system isolation.
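the address formation rule of section 3.6.2 can be illustrated with a short sketch. the function and exception names are hypothetical, and a python exception stands in for the processor's exception 13.

```python
class GeneralProtectionFault(Exception):
    """Stands in for exception 13 in this sketch."""


def v86_linear_address(segment: int, offset: int) -> int:
    """Form a linear address the way real mode and virtual m8086 mode do."""
    if not 0 <= offset <= 0xFFFF:
        # effective addresses that exceed 64 kbyte cause an exception 13
        raise GeneralProtectionFault("offset exceeds 64 kbyte")
    # segment register contents shifted left 4 bits, plus the offset
    return ((segment & 0xFFFF) << 4) + offset
```

segment 1234h with offset 0010h gives linear address 12350h; the paging unit is then free to map that linear address to any physical page.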
although it is not strictly necessary to have the paging hardware enabled to run virtual mode tasks, it is needed in order to run multiple virtual mode tasks or to relo- cate the address space of a virtual mode task to physical address space greater than one megabyte. the paging hardware allows the 20-bit linear ad- dress produced by a virtual mode program to be divided into up to 256 pages. each one of the pages can be located anywhere within the maximum 4 giga- byte physical address space of the military intel386 processor. in addition, since cr3 (the page directo- ry base register) is loaded by a task switch, each virtual mode task can use a different mapping scheme to map pages to different physical locations. finally, the paging hardware allows the sharing of the m8086 operating system code between multiple 57
military intel386 tm microprocessor 271052 69 figure 4-24. virtual m8086 environment memory management m8086 applications. figure 4-24 shows how the mili- tary intel386 processor paging hardware enables multiple m8086 programs to run under a virtual memory demand paged system. 3.6.4 protection and i/o permission bitmap all virtual m8086 mode programs execute at privi- lege level 3, the level of least privilege. as such, virtual m8086 mode programs are subject to all of the protection checks defined in protected mode. (this is different from real mode which implicitly is executing at privilege level 0, the level of greatest privilege.) thus, an attempt to execute a privileged instruction when in virtual m8086 mode will cause an exception 13 fault. the following are privileged instructions, which may be executed only at privilege level 0. therefore, at- tempting to execute these instructions in virtual m8086 mode (or anytime cpl l 0) causes an ex- ception 13 fault: lidt; mov drn,reg; mov reg,drn; lgdt; mov trn,reg; mov reg,trn; lmsw; mov crn,reg; mov reg,crn. clts; hlt; several instructions, particularly those applying to the multitasking model and protection model, are available only in protected mode. therefore, at- tempting to execute the following instructions in real mode or in virtual m8086 mode generates an exception 6 fault: ltr; str; lldt; sldt; lar; verr; lsl; verw; arpl. the instructions which are iopl-sensitive in protect- ed mode are: in; sti; out; cli ins; outs; rep ins; rep outs; 58
in virtual m8086 mode, a slightly different set of instructions is made iopl-sensitive. the following instructions are iopl-sensitive in virtual m8086 mode:

int n; pushf; popf; sti; cli; iret.

the pushf, popf, and iret instructions are iopl-sensitive in virtual m8086 mode only. this provision allows the if flag (interrupt enable flag) to be virtualized to the virtual m8086 mode program. the int n software interrupt instruction is also iopl-sensitive in virtual m8086 mode. note, however, that the int 3 (opcode 0cch), into, and bound instructions are not iopl-sensitive in virtual m8086 mode (they are not iopl-sensitive in protected mode either).

note that the i/o instructions (in, out, ins, outs, rep ins, and rep outs) are not iopl-sensitive in virtual m8086 mode. rather, the i/o instructions become automatically sensitive to the i/o permission bitmap contained in the military intel386 processor task state segment. the i/o permission bitmap, automatically used by the military intel386 processor in virtual m8086 mode, is illustrated by figures 4-15a and 4-15b.

the i/o permission bitmap can be viewed as a 0 to 64 kbit bit string, which begins in memory at offset bit_map_offset in the current tss. the 16-bit pointer bit_map_offset (15:0) is found in the word beginning at offset 66h (102 decimal) from the tss base, as shown in figure 4-15a.

each bit in the i/o permission bitmap corresponds to a single byte-wide i/o port, as illustrated in figure 4-15a. if a bit is 0, i/o to the corresponding byte-wide port can occur without generating an exception. otherwise the i/o instruction causes an exception 13 fault. since every byte-wide i/o port must be protectable, all bits corresponding to a word-wide or dword-wide port must be 0 for the word-wide or dword-wide i/o to be permitted. if all the referenced bits are 0, the i/o will be allowed.
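the bitmap rule can be sketched in a few lines of python. this is only an illustrative model, not processor microcode: the function name io_permitted is hypothetical, and the mapping of port p to bit (p mod 8) of byte (p div 8) is an assumption taken from the usual reading of figure 4-15a. ports whose bits lie outside the supplied bitmap are treated as denied.

```python
def io_permitted(bitmap: bytes, port: int, width: int = 1) -> bool:
    """Return True if an access of `width` bytes at `port` would be allowed.

    `bitmap` holds the i/o permission bytes that lie within the tss limit.
    Assumption: port p is controlled by bit (p % 8) of byte (p // 8).
    """
    for p in range(port, port + width):
        byte_index, bit_index = divmod(p, 8)
        if byte_index >= len(bitmap):
            return False  # no map bit available for this port: exception 13
        if (bitmap[byte_index] >> bit_index) & 1:
            return False  # a 1 bit denies the access: exception 13
    return True  # every referenced bit is 0


# 32 bytes of zeros permit byte-wide i/o to ports 0 through 255
all_open = bytes(32)
```

a word-wide access at port 255 touches ports 255 and 256, so with the 32-byte map above it is refused even though port 255 alone is permitted.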
if any referenced bits are 1, the attempted i/o will cause an exception 13 fault.

due to the use of a pointer to the base of the i/o permission bitmap, the bitmap may be located anywhere within the tss, or may be ignored completely by pointing the bit_map_offset (15:0) beyond the limit of the tss segment. in the same manner, only a small portion of the 64k i/o space need have an associated map bit, by adjusting the tss limit to truncate the bitmap. this eliminates the commitment of 8k of memory when a complete bitmap is not required, while allowing the fully general case if desired.

example of bitmap for i/o ports 0-255: setting the tss limit to bit_map_offset + 31 + 1 ** [** see note below] will allow a 32-byte bitmap for i/o ports 0-255, plus a terminator byte of all 1's [** see note below]. this allows the i/o bitmap to control i/o permission to i/o ports 0-255 while causing an exception 13 fault on attempted i/o to any i/o port 256 through 65,535.

** important implementation note: beyond the last byte of i/o mapping information in the i/o permission bitmap must be a byte containing all 1's. the byte of all 1's must be within the limit of the intel386 tss segment (see figure 4-15a).

3.6.5 interrupt handling

in order to fully support the emulation of an m8086 machine, interrupts in virtual m8086 mode are handled in a unique fashion. when running in virtual mode all interrupts and exceptions involve a privilege change back to the host military intel386 processor operating system. the military intel386 processor operating system determines if the interrupt comes from a protected mode application or from a virtual mode program by examining the vm bit in the eflags image stored on the stack.

when a virtual mode program is interrupted and execution passes to the interrupt routine at level 0, the vm bit is cleared. however, the vm bit is still set in the eflags image on the stack.
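as a small illustration of that test, the sketch below checks the vm bit in a saved eflags image. the function name is hypothetical; the bit position (bit 17 of eflags) is the architectural location of the vm flag, which is not restated in this section.

```python
EFLAGS_VM = 1 << 17  # vm flag: bit 17 of the 32-bit eflags register


def interrupted_in_virtual_mode(eflags_image: int) -> bool:
    """Return True if the saved eflags image has the vm bit set."""
    return bool(eflags_image & EFLAGS_VM)
```

a handler would apply this to the eflags image saved on its level 0 stack, not to the live flags register, because the live vm bit has already been cleared by the time the handler runs.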
the military intel386 processor operating system in turn handles the exception or interrupt and then returns control to the m8086 program. the military intel386 processor operating system may choose to let the m8086 operating system handle the interrupt or it may emulate the function of the interrupt handler. for example, many m8086 operating system calls are accessed by pushing parameters on the stack, and then executing an int n instruction. if the iopl is set to 0 then all int n instructions will be intercepted by the military intel386 processor operating system. the military intel386 processor operating system could emulate the m8086 operating system's call. figure 4-25 shows how the military intel386 processor operating system could intercept an m8086 operating system's call to "open a file".

a military intel386 processor operating system can provide a virtual m8086 environment which is totally transparent to the application software by intercepting and then emulating m8086 operating system calls, and intercepting in and out instructions.

3.6.6 entering and leaving virtual m8086 mode

virtual m8086 mode is entered by executing an iret instruction (at cpl = 0), or task switch (at any cpl) to a military intel386 processor task whose military intel386 processor tss has a flags image containing a 1 in the vm bit position while the processor is executing in protected mode. that is, one way to enter virtual m8086 mode is to switch to a task with a military intel386 processor tss that has a 1 in the vm bit in the eflags image. the other way is to execute a 32-bit iret instruction at privilege level 0, where the stack has a 1 in the vm bit in the eflags image.

popf does not affect the vm bit, even if the processor is in protected mode or level 0, and so cannot be used to enter virtual m8086 mode. pushf always pushes a 0 in the vm bit, even if the processor is in virtual m8086 mode, so that a program cannot tell if it is executing in real mode, or in virtual m8086 mode.

the vm bit can be set by executing an iret instruction only at privilege level 0, or by any instruction or interrupt which causes a task switch in protected mode (with vm = 1 in the new flags image), and can be cleared only by an interrupt or exception in virtual m8086 mode. iret and popf instructions executed in real mode or virtual m8086 mode will not change the value in the vm bit.

the transition out of virtual m8086 mode to military intel386 processor protected mode occurs only on receipt of an interrupt or exception (such as due to a sensitive instruction). in virtual m8086 mode, all interrupts and exceptions vector through the protected mode idt, and enter an interrupt handler in protected military intel386 processor mode. that is, as part of interrupt processing, the vm bit is cleared.

because the matching iret must occur from level 0, if an interrupt or trap gate is used to field an interrupt or exception out of virtual m8086 mode, the gate must perform an inter-level interrupt only to level 0. interrupt or trap gates through conforming segments, or through segments with dpl > 0, will raise a gp fault with the cs selector as the error code.
3.6.6.1 task switches to/from virtual m8086 mode

tasks which can execute in virtual m8086 mode must be described by a tss with the new military intel386 processor format (type 9 or 11 descriptor).

a task switch out of virtual m8086 mode will operate exactly the same as any other task switch out of a task with a military intel386 processor tss. all of the programmer visible state, including the flags register with the vm bit set to 1, is stored in the tss. the segment registers in the tss will contain m8086 segment base values rather than selectors.

a task switch into a task described by a military intel386 processor tss will have an additional check to determine if the incoming task should be resumed in virtual m8086 mode. tasks described by 286 format tsss cannot be resumed in virtual m8086 mode, so no check is required there (the flags image in a 286 format tss has only the low order 16 flags bits). before loading the segment register images from a military intel386 processor tss, the flags image is loaded, so that the segment registers are loaded from the tss image as m8086 segment base values. the task is now ready to resume in virtual m8086 execution mode.

3.6.6.2 transitions through trap and interrupt gates, and iret

a task switch is one way to enter or exit virtual m8086 mode. the other method is to exit through a trap or interrupt gate, as part of handling an interrupt, and to enter as part of executing an iret instruction. the transition out must use a military intel386 processor interrupt gate (type 14) or military intel386 processor trap gate (type 15), which must point to a non-conforming level 0 segment (dpl = 0) in order to permit the trap handler to iret back to the virtual m8086 program. the gate must point to a non-conforming level 0 segment to perform a level switch to level 0 so that the matching iret can change the vm bit.
military intel386 processor gates must be used, since 286 gates save only the low 16 bits of the flags register, so that the vm bit will not be saved on transitions through the 286 gates. also, the 16-bit iret (presumably) used to terminate the 286 interrupt handler will pop only the lower 16 bits from flags, and will not affect the vm bit. the action taken for a military intel386 processor trap or interrupt gate if an interrupt occurs while the task is executing in virtual m8086 mode is given by the following sequence.

(1) save the flags register in a temp to push later. turn off the vm and tf bits, and if the interrupt is serviced by an interrupt gate, turn off if also.

(2) interrupt and trap gates must perform a level switch from 3 (where the vm86 program executes) to level 0 (so iret can return). this process involves a stack switch to the stack given in the tss for privilege level 0. save the virtual m8086 mode ss and esp registers to push in a later step. the segment register load of ss will be done as a protected mode segment load, since the vm bit was turned off above.

(3) push the m8086 segment register values onto the new stack, in the order: gs, fs, ds, es. these are pushed as 32-bit quantities, with undefined values in the upper 16 bits. then load these 4 registers with null selectors (0).

(4) push the old m8086 stack pointer onto the new stack by pushing the ss register (as 32 bits, high bits undefined), then pushing the 32-bit esp register saved above.

(5) push the 32-bit flags register saved in step 1.

(6) push the old m8086 instruction pointer onto the new stack by pushing the cs register (as 32 bits, high bits undefined), then pushing the 32-bit eip register.

(7) load up the new cs:eip value from the interrupt gate, and begin execution of the interrupt routine in protected military intel386 processor mode.

[figure 4-25. virtual m8086 environment interrupt and call handling. the m8086 application makes an "open file" call, which causes a general protection fault (arrow #1); the virtual m8086 monitor intercepts the call and calls the military intel386 processor os (arrow #2); the military intel386 processor os opens the file and returns control to the m8086 os (arrow #3); the m8086 os returns control to the application (arrow #4). the sequence is transparent to the application.]

the transition out of virtual m8086 mode performs a level change and stack switch, in addition to changing back to protected mode. in addition, all of the m8086 segment register images are stored on the stack (behind the ss:esp image), and then loaded with null (0) selectors before entering the interrupt handler. this will permit the handler to safely save and restore the ds, es, fs, and gs registers as 286 selectors. this is needed so that interrupt handlers which don't care about the mode of the interrupted program can use the same prolog and epilog code for state saving (i.e. push all registers in prolog, pop all in epilog) regardless of whether or not a "native" mode or virtual m8086 mode program was interrupted. restoring null selectors to these registers before executing the iret will not cause a trap in the interrupt handler. interrupt routines which expect values in the segment registers, or return values in segment registers, will have to obtain/return values from the m8086 register images pushed onto the new stack. they will need to know the mode of the interrupted program in order to know where to find/return segment registers, and also to know how to interpret segment register values.

the iret instruction will perform the inverse of the above sequence.
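the push order in steps (3) through (6) fixes the layout of the level 0 stack frame that the handler sees. the sketch below simply replays those pushes on a simulated downward-growing stack; the names PUSH_ORDER and push_vm86_frame are hypothetical, and the model ignores any error code that a particular exception might push afterwards.

```python
# push order taken from steps (3)-(6): segment registers, old stack pointer,
# saved flags, then the old instruction pointer
PUSH_ORDER = ("gs", "fs", "ds", "es", "ss", "esp", "eflags", "cs", "eip")


def push_vm86_frame(regs: dict, esp0: int):
    """Replay the pushes on a stack that starts at esp0 and grows downward.

    Returns (new_esp, stack) where stack maps address -> (name, value).
    Every item occupies 32 bits (4 bytes), as the text specifies.
    """
    stack = {}
    esp = esp0
    for name in PUSH_ORDER:
        esp -= 4
        stack[esp] = (name, regs[name])
    return esp, stack
```

replaying the pushes places eip at [esp], cs at [esp + 4], the flags image at [esp + 8], the old esp and ss at [esp + 12] and [esp + 16], and es, ds, fs, gs at [esp + 20] through [esp + 32], which is where a handler that needs the m8086 segment values has to look.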
only the extended military intel386 processor iret instruction (operand size = 32) can be used, and it must be executed at level 0 to change the vm bit to 1.

(1) if the nt bit in the flags register is on, an inter-task return is performed. the current state is stored in the current tss, and the link field in the current tss is used to locate the tss for the interrupted task which is to be resumed. otherwise, continue with the following sequence.

(2) read the flags image from ss:8[esp] into the flags register. this will set vm to the value active in the interrupted routine.

(3) pop off the instruction pointer cs:eip. eip is popped first, then a 32-bit word is popped which contains the cs value in the lower 16 bits. if vm = 0, this cs load is done as a protected mode segment load. if vm = 1, this will be done as an m8086 segment load.

(4) increment the esp register by 4 to bypass the flags image which was "popped" in step 2.

(5) if vm = 1, load segment registers es, ds, fs, and gs from memory locations ss:[esp + 8], ss:[esp + 12], ss:[esp + 16], and ss:[esp + 20], respectively, where the new value of esp stored in step 4 is used. since vm = 1, these are done as m8086 segment register loads. else if vm = 0, check that the selectors in es, ds, fs, and gs are valid in the interrupted routine. null out invalid selectors to trap if an attempt is made to access through them.

(6) if rpl(cs) > cpl, pop the stack pointer ss:esp from the stack. the esp register is popped first, followed by 32 bits containing ss in the lower 16 bits. if vm = 0, ss is loaded as a protected mode segment register load. if vm = 1, an m8086 segment register load is used.

(7) resume execution of the interrupted routine. the vm bit in the flags register (restored from the interrupt routine's stack image in step 2) determines whether the processor resumes the interrupted routine in protected mode or virtual m8086 mode.

4.0 functional data

4.1 introduction

the military intel386 processor features a straightforward functional interface to the external hardware. the military intel386 processor has separate, parallel buses for data and address. the data bus is 32 bits in width, and bidirectional. the address bus outputs 32-bit address values in the most directly usable form for the high-speed local bus: 4 individual byte enable signals, and the 30 upper-order bits as a binary value. the data and address buses are interpreted and controlled with their associated control signals.

a dynamic data bus sizing feature allows the processor to handle a mix of 32- and 16-bit external buses on a cycle-by-cycle basis (see 4.3.4 data bus sizing). if 16-bit bus size is selected, the m80386 automatically makes any adjustment needed, even performing another 16-bit bus cycle to complete the transfer if that is necessary. 8-bit peripheral devices may be connected to 32-bit or 16-bit buses with no loss of performance. a new address pipelining option is provided and applies to 32-bit and 16-bit buses for substantially improved memory utilization, especially for the most heavily used memory resources.

the address pipelining option, when selected, typically allows a given memory interface to operate with one less wait state than would otherwise be required (see 4.4.2 address pipelining).
the pipelined bus is also well suited to interleaved memory designs. for 16 mhz interleaved memory designs with 100 ns access time drams, zero wait states can be achieved when pipelined addressing is selected. when address pipelining is requested by the external hardware, the military intel386 processor will output the address and bus cycle definition of the next bus cycle (if it is internally available) even while waiting for the current cycle to be acknowledged.

non-pipelined address timing, however, is ideal for external cache designs, since the cache memory will typically be fast enough to allow non-pipelined cycles. for maximum design flexibility, the address pipelining option is selectable on a cycle-by-cycle basis.

the processor's bus cycle is the basic mechanism for information transfer, either from system to processor, or from processor to system. military intel386 processor bus cycles perform data transfer in a minimum of only two clock periods. on a 32-bit data bus, the maximum military intel386 processor transfer bandwidth at 16 mhz is therefore 32 mbytes/sec. any bus cycle will be extended for more than two clock periods, however, if external hardware withholds acknowledgement of the cycle. at the appropriate time, acknowledgement is signalled by asserting the military intel386 processor ready# input.

the military intel386 processor can relinquish control of its local buses to allow mastership by other devices, such as direct memory access channels. when relinquished, hlda is the only output pin driven by the military intel386 processor, providing near-complete isolation of the processor from its system. the near-complete isolation characteristic is ideal when driving the system from test equipment, and in fault-tolerant applications.

functional data covered in this chapter describes the processor's hardware interface. first, the set of signals available at the processor pins is described (see 4.2 signal description).
following that are the signal waveforms occurring during bus cycles (see 4.3 bus transfer mechanism, 4.4 bus functional description and 4.5 other functional descriptions).

4.2 signal description

4.2.1 introduction

the signal descriptions sometimes refer to ac timing parameters, such as "t25 reset setup time" and "t26 reset hold time." the values of these parameters can be found in tables 7-4 and 7-5.

4.2.2 clock (clk2)

clk2 provides the fundamental timing for the military intel386 microprocessor. it is divided by two internally to generate the internal processor clock used for instruction execution. the internal clock is comprised of two phases, "phase one" and "phase two." each clk2 period is a phase of the internal clock. figure 5-2 illustrates the relationship. if desired, the phase of the internal processor clock can be synchronized to a known phase by ensuring the reset signal falling edge meets its applicable setup and hold times, t25 and t26.
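the clock relationship, together with the two-clock minimum bus cycle quoted in section 4.1, accounts for the 32 mbytes/sec figure. the short sketch below works through that arithmetic; the function names are hypothetical, and the 32 mhz clk2 value is simply what the divide-by-two rule implies for a 16 mhz processor clock.

```python
def processor_clock_hz(clk2_hz: int) -> int:
    """Internal processor clock: clk2 divided by two."""
    return clk2_hz // 2


def peak_bandwidth_bytes_per_sec(processor_hz: int,
                                 bus_bytes: int = 4,
                                 clocks_per_cycle: int = 2) -> int:
    """Peak transfer rate with zero wait states.

    One bus cycle moves `bus_bytes` bytes and takes at least
    `clocks_per_cycle` processor clock periods.
    """
    return processor_hz // clocks_per_cycle * bus_bytes


clk = processor_clock_hz(32_000_000)       # 16 mhz processor clock
peak = peak_bandwidth_bytes_per_sec(clk)   # 32,000,000 bytes per second
```

passing bus_bytes=2 models a bus held at 16 bits, which halves the peak to 16,000,000 bytes per second under the same zero-wait-state assumption.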
[figure 5-1. functional signal groups]

[figure 5-2. clk2 signal and internal processor clock]

4.2.3 data bus (d0 through d31)

these three-state bidirectional signals provide the general purpose data path between the m80386 and other devices. data bus inputs and outputs indicate "1" when high. the data bus can transfer data on 32- and 16-bit buses using a data bus sizing feature controlled by the bs16# input. see section 4.2.6 bus control. data bus reads require that read data setup and hold times t21 and t22 be met for correct operation. during any write operation (and during halt cycles and shutdown cycles), the military intel386 processor always drives all 32 signals of the data bus even if the current bus size is 16 bits.

4.2.4 address bus (be0# through be3#, a2 through a31)

these three-state outputs provide physical memory addresses or i/o port addresses. the address bus is capable of addressing 4 gigabytes of physical memory space (00000000h through ffffffffh), and 64 kilobytes of i/o address space (00000000h through 0000ffffh) for programmed i/o. i/o transfers automatically generated for military intel386 processor-to-coprocessor communication use i/o addresses 800000f8h through 800000ffh, so a31 high in conjunction with m/io# low allows simple generation of the coprocessor select signal.
the byte enable outputs, be0#-be3#, directly indicate which bytes of the 32-bit data bus are involved with the current transfer. this is most convenient for external hardware.

be0# applies to d0-d7
be1# applies to d8-d15
be2# applies to d16-d23
be3# applies to d24-d31

the number of byte enables asserted indicates the physical size of the operand being transferred (1, 2, 3, or 4 bytes). refer to section 4.3.6 operand alignment.

when a memory write cycle or i/o write cycle is in progress, and the operand being transferred occupies only the upper 16 bits of the data bus (d16-d31), duplicate data is simultaneously presented on the corresponding lower 16 bits of the data bus (d0-d15). this duplication is performed for optimum write performance on 16-bit buses. the pattern of write data duplication is a function of the byte enables asserted during the write cycle. table 5-1 lists the write data present on d0-d31, as a function of the asserted byte enable outputs be0#-be3#.

4.2.5 bus cycle definition signals (w/r#, d/c#, m/io#, lock#)

these three-state outputs define the type of bus cycle being performed. w/r# distinguishes between write and read cycles. d/c# distinguishes between data and control cycles. m/io# distinguishes between memory and i/o cycles. lock# distinguishes between locked and unlocked bus cycles.

the primary bus cycle definition signals are w/r#, d/c# and m/io#, since these are the signals driven valid as ads# (address status output) is driven asserted. lock# is driven valid at the same time as the first locked bus cycle begins, which due to address pipelining, could be later than ads# is driven asserted. see 4.4.3.4 pipelined address. lock# is negated when the ready# input terminates the last bus cycle which was locked.

exact bus cycle definitions, as a function of w/r#, d/c#, and m/io#, are given in table 5-2.
note that one combination of w/r#, d/c# and m/io# is never given when ads# is asserted (however, that combination, which is listed as "does not occur," will occur during idle bus states when ads# is not asserted). if m/io#, d/c#, and w/r# are qualified by ads# asserted, then a decoding scheme may use the non-occurring combination to its best advantage.

table 5-1. write data duplication as a function of be0#-be3#

military intel386 processor  | military intel386 processor         | automatic
byte enables                 | write data                          | duplication?
be3#   be2#   be1#   be0#    | d24-d31  d16-d23  d8-d15   d0-d7    |
high   high   high   low     | undef    undef    undef    a        | no
high   high   low    high    | undef    undef    b        undef    | no
high   low    high   high    | undef    c        undef    c        | yes
low    high   high   high    | d        undef    d        undef    | yes
high   high   low    low     | undef    undef    b        a        | no
high   low    low    high    | undef    c        b        undef    | no
low    low    high   high    | d        c        d        c        | yes
high   low    low    low     | undef    c        b        a        | no
low    low    low    high    | d        c        b        undef    | no
low    low    low    low     | d        c        b        a        | no

key:
d = logical write data d24-d31
c = logical write data d16-d23
b = logical write data d8-d15
a = logical write data d0-d7
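table 5-1 follows a single rule: when neither be0# nor be1# is asserted, whatever is driven on the upper half of the bus is mirrored onto the lower half. the sketch below encodes that reading of the table. the function name and the use of None for "undef" are illustrative choices, not part of the datasheet.

```python
def write_data_lanes(asserted: set) -> list:
    """Return the data driven on each byte lane during a write.

    `asserted` holds the indices (0-3) of the byte enables that are low.
    The result is indexed by lane: 0 = d0-d7 ... 3 = d24-d31, using the
    table's key letters "a".."d", or None where the table says undef.
    """
    key = "abcd"
    lanes = [key[i] if i in asserted else None for i in range(4)]
    # duplication applies only when the operand sits entirely in d16-d31
    if 0 not in asserted and 1 not in asserted:
        if 2 in asserted:
            lanes[0] = "c"  # d16-d23 mirrored onto d0-d7
        if 3 in asserted:
            lanes[1] = "d"  # d24-d31 mirrored onto d8-d15
    return lanes
```

for example, write_data_lanes({2, 3}) gives ["c", "d", "c", "d"], matching the "d c d c" row of the table when read from d0-d7 upward.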
table 5-2. bus cycle definition

m/io#  d/c#   w/r#   bus cycle type           locked?
low    low    low    interrupt acknowledge    yes
low    low    high   does not occur           -
low    high   low    i/o data read            no
low    high   high   i/o data write           no
high   low    low    memory code read         no
high   low    high   halt or shutdown (*)     no
high   high   low    memory data read         some cycles
high   high   high   memory data write        some cycles

(*) halt: address = 2 (be0# high, be1# high, be2# low, be3# high, a2-a31 low). shutdown: address = 0 (be0# low, be1# high, be2# high, be3# high, a2-a31 low).

4.2.6 bus control signals

4.2.6.1 introduction

the following signals allow the processor to indicate when a bus cycle has begun, and allow other system hardware to control address pipelining, data bus width and bus cycle termination.

4.2.6.2 address status (ads#)

this three-state output indicates that a valid bus cycle definition and address (w/r#, d/c#, m/io#, be0#-be3#, and a2-a31) is being driven at the military intel386 processor pins. it is asserted during t1 and t2p bus states (see 4.4.3.2 non-pipelined address and 4.4.3.4 pipelined address for additional information on bus states).

4.2.6.3 transfer acknowledge (ready#)

this input indicates the current bus cycle is complete, and the active bytes indicated by be0#-be3# and bs16# are accepted or provided. when ready# is sampled asserted during a read cycle or interrupt acknowledge cycle, the military intel386 processor latches the input data and terminates the cycle. when ready# is sampled asserted during a write cycle, the processor terminates the bus cycle.

ready# is ignored on the first bus state of all bus cycles, and sampled each bus state thereafter until asserted. ready# must eventually be asserted to acknowledge every bus cycle, including halt indication and shutdown indication bus cycles. when being sampled, ready# must always meet setup and hold times t19 and t20 for correct operation. see all sections of 4.4 bus functional description.
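returning to table 5-2, external decode logic amounts to a three-input lookup qualified by ads#, with the byte enables separating halt from shutdown. the sketch below is a hypothetical software model of that lookup (pin levels are given as 1 for high and 0 for low; the names are not from the datasheet).

```python
# keys are (m/io#, d/c#, w/r#) pin levels, 1 = high, 0 = low
BUS_CYCLES = {
    (0, 0, 0): "interrupt acknowledge",
    (0, 0, 1): "does not occur",
    (0, 1, 0): "i/o data read",
    (0, 1, 1): "i/o data write",
    (1, 0, 0): "memory code read",
    (1, 0, 1): "halt or shutdown",
    (1, 1, 0): "memory data read",
    (1, 1, 1): "memory data write",
}


def decode_bus_cycle(m_io: int, d_c: int, w_r: int, asserted_be=None) -> str:
    """Classify a bus cycle from its definition pins (valid while ads# is low).

    `asserted_be` optionally gives the set of byte-enable indices that are
    low; it is used only to tell halt (be2# low) from shutdown (be0# low).
    """
    kind = BUS_CYCLES[(m_io, d_c, w_r)]
    if kind == "halt or shutdown" and asserted_be is not None:
        if asserted_be == {2}:
            return "halt"
        if asserted_be == {0}:
            return "shutdown"
    return kind
```

a decoder built this way must still acknowledge halt and shutdown cycles with ready#, as the text above requires.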
4.2.6.4 next address request (na#)

this is used to request address pipelining. this input indicates the system is prepared to accept new values of be0#-be3#, a2-a31, w/r#, d/c# and m/io# from the military intel386 processor even if the end of the current cycle is not being acknowledged on ready#. if this input is asserted when sampled, the next address is driven onto the bus, provided the next bus request is already pending internally. see 4.4.2 address pipelining and 4.4.3 read and write cycles.

4.2.6.5 bus size 16 (bs16#)

the bs16# feature allows the military intel386 processor to directly connect to 32-bit and 16-bit data buses. asserting this input constrains the current bus cycle to use only the lower-order half (d0-d15) of the data bus, corresponding to be0# and be1#. asserting bs16# has no additional effect if only be0# and/or be1# are asserted in the current cycle. however, during bus cycles asserting be2# or be3#, asserting bs16# will automatically cause the military intel386 processor to make adjustments for correct transfer of the upper byte(s) using only physical data signals d0-d15.

if the operand spans both halves of the data bus and bs16# is asserted, the military intel386 processor will automatically perform another 16-bit bus cycle. bs16# must always meet setup and hold times t17 and t18 for correct operation.
military intel386 processor i/o cycles are automatically generated for coprocessor communication. since the military intel386 processor must transfer 32-bit quantities between itself and the m387 npx, bs16# must not be asserted during m387 npx communication cycles.

4.2.7 bus arbitration signals

4.2.7.1 introduction

this section describes the mechanism by which the processor relinquishes control of its local buses when requested by another bus master device. see 4.5.1 entering and exiting hold acknowledge for additional information.

4.2.7.2 bus hold request (hold)

this input indicates some device other than the military intel386 processor requires bus mastership.

hold must remain asserted as long as any other device is a local bus master. hold is not recognized while reset is asserted. if reset is asserted while hold is asserted, reset has priority and places the bus into an idle state, rather than the hold acknowledge (high impedance) state.

hold is level-sensitive and is a synchronous input. hold signals must always meet setup and hold times t23 and t24 for correct operation.

4.2.7.3 bus hold acknowledge (hlda)

assertion of this output indicates the military intel386 processor has relinquished control of its local bus in response to hold asserted, and is in the bus hold acknowledge state.

the hold acknowledge state offers near-complete signal isolation. in the hold acknowledge state, hlda is the only signal being driven by the military intel386 processor. the other output signals or bidirectional signals (d0-d31, be0#-be3#, a2-a31, w/r#, d/c#, m/io#, lock# and ads#) are in a high-impedance state so the requesting bus master may control them. pullup resistors may be desired on several signals to avoid spurious activity when no bus master is driving them. see 6.2.3 resistor recommendations.
also, one rising edge occurring on the nmi input during hold acknowledge is remembered, for processing after the hold input is negated.

in addition to the normal usage of hold acknowledge with dma controllers or master peripherals, the near-complete isolation is particularly attractive during system test when test equipment drives the system, and in hardware-fault-tolerant applications.

4.2.8 coprocessor interface signals

4.2.8.1 introduction

the following sections describe the signals dedicated to the numeric coprocessor interface. in addition to the data bus, address bus, and bus cycle definition signals, these signals control communication between the military intel386 microprocessor and its m387 processor extension.

4.2.8.2 coprocessor request (pereq)

when asserted, this input signal indicates a coprocessor request for a data operand to be transferred to/from memory by the military intel386 processor. in response, the military intel386 processor transfers information between the coprocessor and memory. because the military intel386 processor has internally stored the coprocessor opcode being executed, it performs the requested data transfer with the correct direction and memory address.

pereq is level-sensitive and is allowed to be asynchronous to the clk2 signal.

4.2.8.3 coprocessor busy (busy#)

when asserted, this input indicates the coprocessor is still executing an instruction, and is not yet able to accept another. when the military intel386 processor encounters any coprocessor instruction which operates on the numeric stack (e.g. load, pop, or arithmetic operation), or the wait instruction, this input is first automatically sampled until it is seen to be negated. this sampling of the busy# input prevents overrunning the execution of a previous coprocessor instruction.
the fninit and fnclex coprocessor instructions are allowed to execute even if busy# is asserted, since these instructions are used for coprocessor initialization and exception-clearing.

busy# is level-sensitive and is allowed to be asynchronous to the clk2 signal.

busy# serves an additional function. if busy# is sampled low at the falling edge of reset, the military intel386 processor performs an internal self-test (see 4.5.3 bus activity during and following reset). if busy# is sampled high, no self-test is performed.
4.2.8.4 coprocessor error (error#)

this input signal indicates that the previous coprocessor instruction generated a coprocessor error of a type not masked by the coprocessor's control register. this input is automatically sampled by the military intel386 processor when a coprocessor instruction is encountered, and if asserted, the military intel386 processor generates exception 16 to access the error-handling software.

several coprocessor instructions, generally those which clear the numeric error flags in the coprocessor or save coprocessor state, do execute without the military intel386 processor generating exception 16 even if error# is asserted. these instructions are fninit, fnclex, fstsw, fstswax, fstcw, fstenv, fsave, festenv and fesave.

error# is level-sensitive and is allowed to be asynchronous to the clk2 signal.

error# serves an additional function. if error# is low no later than 20 clk2 periods after the falling edge of reset and remains low at least until the military intel386 processor begins its first bus cycle, a military i387 npx is assumed to be present (the et bit in cr0 is automatically set to 1). otherwise, an m80287 (or no coprocessor) is assumed to be present (the et bit in cr0 is automatically reset to 0). see 4.5.3 bus activity during and following reset. only the et bit is set by this error# pin test. software must set the em and mp bits in cr0 as needed. therefore, distinguishing m80287 presence from no coprocessor requires a software test and appropriately resetting or setting the em bit of cr0 (set em = 1 when no coprocessor is present). if error# is sampled low after reset (indicating a military i387 npx) but software later sets em = 1, the military intel386 processor will behave as if no coprocessor is present.

4.2.9 interrupt signals

4.2.9.1 introduction

the following descriptions cover inputs that can interrupt or suspend execution of the processor's current instruction stream.
4.2.9.2 maskable interrupt request (intr)

when asserted, this input indicates a request for interrupt service, which can be masked by the military intel386 processor flag register if bit. when the military intel386 processor responds to the intr input, it performs two interrupt acknowledge bus cycles, and at the end of the second, latches an 8-bit interrupt vector on d0-d7 to identify the source of the interrupt.

intr is level-sensitive and is allowed to be asynchronous to the clk2 signal. to assure recognition of an intr request, intr should remain asserted until the first interrupt acknowledge bus cycle begins.

4.2.9.3 non-maskable interrupt request (nmi)

this input indicates a request for interrupt service, which cannot be masked by software. the non-maskable interrupt request is always processed according to the pointer or gate in slot 2 of the interrupt table. because of the fixed nmi slot assignment, no interrupt acknowledge cycles are performed when processing nmi.

nmi is rising edge-sensitive and is allowed to be asynchronous to the clk2 signal. to assure recognition of nmi, it must be negated for at least eight clk2 periods, and then be asserted for at least eight clk2 periods.

once nmi processing has begun, no additional nmis are processed until after the next iret instruction, which is typically the end of the nmi service routine. if nmi is re-asserted prior to that time, however, one rising edge on nmi will be remembered for processing after executing the next iret instruction.

4.2.9.4 reset (reset)

this input signal suspends any operation in progress and places the military intel386 processor in a known reset state. the military intel386 processor is reset by asserting reset for 15 or more clk2 periods (80 or more clk2 periods before requesting self test). when reset is asserted, all other input pins are ignored, and all other bus pins are driven to an idle bus state as shown in table 5-3.
if reset and hold are both asserted at a point in time, reset takes priority even if the military intel386 processor was in a hold acknowledge state prior to reset asserted.

reset is level-sensitive and must be synchronous to the clk2 signal. if desired, the phase of the internal processor clock, and the entire military intel386 processor state can be completely synchronized to external circuitry by ensuring the reset signal falling edge meets its applicable setup and hold times, t25 and t26.

table 5-3. pin state (bus idle) during reset

pin name      signal level during reset
ads           high
d0-d31        high impedance
be0-be3       low
a2-a31        high
w/r           low
d/c           high
m/io          low
lock          high
hlda          low
4.2.10 signal summary

table 5-4 summarizes the characteristics of all military intel386 processor signals.

table 5-4. military intel386 processor signal summary

signal    signal function               active   input/   input synch or    output high impedance
name                                    state    output   asynch to clk2    during hlda?
clk2      clock                         -        i        -                 -
d0-d31    data bus                      high     i/o      s                 yes
be0-be3   byte enables                  low      o        -                 yes
a2-a31    address bus                   high     o        -                 yes
w/r       write-read indication         high     o        -                 yes
d/c       data-control indication       high     o        -                 yes
m/io      memory-i/o indication         high     o        -                 yes
lock      bus lock indication           low      o        -                 yes
ads       address status                low      o        -                 yes
na        next address request          low      i        s                 -
bs16      bus size 16                   low      i        s                 -
ready     transfer acknowledge          low      i        s                 -
hold      bus hold request              high     i        s                 -
hlda      bus hold acknowledge          high     o        -                 no
pereq     coprocessor request           high     i        a                 -
busy      coprocessor busy              low      i        a                 -
error     coprocessor error             low      i        a                 -
intr      maskable interrupt request    high     i        a                 -
nmi       non-maskable intrpt request   high     i        a                 -
reset     reset                         high     i        s                 -

4.3 bus transfer mechanism

4.3.1 introduction

all data transfers occur as a result of one or more bus cycles. logical data operands of byte, word and double-word lengths may be transferred without restrictions on physical address alignment. any byte boundary may be used, although two or even three physical bus cycles are performed as required for unaligned operand transfers. see 4.3.4 dynamic data bus sizing and 4.3.6 operand alignment.

the military intel386 microprocessor address signals are designed to simplify external system hardware. higher-order address bits are provided by a2-a31. lower-order address in the form of be0-be3 directly provides linear selects for the four bytes of the 32-bit data bus. physical operand size information is thereby implicitly provided each bus cycle in the most usable form.

byte enable outputs be0-be3 are asserted when their associated data bus bytes are involved with the present bus cycle, as listed in table 5-5.
during a bus cycle, any possible pattern of contiguous, asserted byte enable outputs can occur, but never patterns having a negated byte enable separating two or three asserted enables.
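the contiguity rule above can be checked mechanically. the following is a minimal sketch, not part of the datasheet: it assumes the four byte enables are packed into a 4-bit mask in which bit n set means ben is asserted, and the function names are hypothetical. the second function returns the index of the lowest-order asserted byte enable, which is the quantity table 5-6 uses to form a1 and a0.

```python
def is_valid_be_pattern(mask: int) -> bool:
    """Return True if the asserted byte enables form one contiguous run.

    Assumed encoding (illustrative only): bit n of `mask` set means BEn is asserted.
    """
    if not 0 < mask <= 0xF:
        return False                 # no byte enable asserted, or out of range
    lowest = mask & -mask            # isolate the lowest asserted bit
    run = mask // lowest             # shift the run down to bit 0
    return run & (run + 1) == 0      # a contiguous run looks like 0b1, 0b11, 0b111, ...


def low_address_bits(mask: int) -> int:
    """Return the 2-bit value (A1, A0) implied by the lowest asserted byte enable."""
    if not is_valid_be_pattern(mask):
        raise ValueError("non-occurring byte enable pattern")
    return (mask & -mask).bit_length() - 1
```

under this encoding, ten of the sixteen possible masks are valid, which matches the ten occurring rows of the byte enable decoding tables in this section.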
address bits a0 and a1 of the physical operand's base address can be created when necessary (for instance, for multibus i or multibus ii interface), as a function of the lowest-order asserted byte enable. this is shown by table 5-6. logic to generate a0 and a1 is given by figure 5-3.

table 5-5. byte enables and associated data and operand bytes

byte enable signal    associated data bus signals
be0                   d0-d7    (byte 0 - least significant)
be1                   d8-d15   (byte 1)
be2                   d16-d23  (byte 2)
be3                   d24-d31  (byte 3 - most significant)

table 5-6. generating a0-a31 from be0-be3 and a2-a31

physical base address        m80386 address signals
a31 ... a2    a1   a0        a31 ... a2    be3    be2    be1    be0
a31 ... a2    0    0         a31 ... a2    x      x      x      low
a31 ... a2    0    1         a31 ... a2    x      x      low    high
a31 ... a2    1    0         a31 ... a2    x      low    high   high
a31 ... a2    1    1         a31 ... a2    low    high   high   high

figure 5-3. logic to generate a0, a1 from be0-be3 (k-maps for the a1 signal and the a0 signal)

each bus cycle is composed of at least two bus states. each bus state requires one processor clock period. additional bus states added to a single bus cycle are called wait states. see 4.4 bus functional description.

since a bus cycle requires a minimum of two bus states (equal to two processor clock periods), data can be transferred between external devices and the military intel386 processor at a maximum rate of one 4-byte dword every two processor clock periods, for a maximum bus bandwidth of 32 megabytes/second (16 mhz military intel386 processor clock rate).

4.3.2 memory and i/o spaces

bus cycles may access physical memory space or i/o space. peripheral devices in the system may either be memory-mapped, or i/o-mapped, or both. as shown in figure 5-4, physical memory addresses range from 00000000h to ffffffffh (4 gigabytes) and i/o addresses from 00000000h to 0000ffffh (64 kilobytes) for programmed i/o.
note the i/o addresses used by the automatic i/o cycles for coprocessor communication are 800000f8h to 800000ffh, beyond the address range of programmed i/o, to allow easy generation of a coprocessor chip select signal using the a31 and m/io signals.
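as a rough illustration of the two points above, the sketch below decodes a coprocessor select from a31 and m/io and reproduces the 32 megabytes/second figure from 4.3.1. the function names and the boolean representation of the m/io level are assumptions made for this example only; real systems implement the select in address decode logic.

```python
def is_coprocessor_cycle(address: int, m_io_high: bool) -> bool:
    """A31 high together with M/IO low marks an automatic coprocessor I/O cycle."""
    a31 = (address >> 31) & 1
    return a31 == 1 and not m_io_high


def is_programmed_io_address(address: int) -> bool:
    """Programmed I/O uses addresses 00000000h through 0000ffffh."""
    return 0 <= address <= 0x0000FFFF


def max_bus_bandwidth(processor_clock_hz: int) -> int:
    """Bytes per second at one 4-byte dword every two processor clock periods."""
    bytes_per_cycle = 4
    states_per_cycle = 2
    return processor_clock_hz * bytes_per_cycle // states_per_cycle
```

for example, `max_bus_bandwidth(16_000_000)` evaluates to 32,000,000 bytes per second, the figure quoted for a 16 mhz processor clock.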
military intel386 tm microprocessor 271052 5 physical memory space i/o space note: since a31 is high during automatic communication with coprocessor, a31 high and m/io low can be used to easily generate a coprocessor select signal. figure 5-4. physical memory and i/o spaces 4.3.3 memory and i/o organization the military intel386 processor datapath to memory and i/o spaces can be 32 bits wide or 16 bits wide. when 32-bits wide, memory and i/o spaces are or- ganized naturally as arrays of physical 32-bit dwords. each memory or i/o dword has four indi- vidually addressable bytes at consecutive byte ad- dresses. the lowest-addressed byte is associated with data signals d0 d7; the highest-addressed byte with d24 d31. the military intel386 processor includes a bus con- trol input, bs16 , that also allows direct connection to 16-bit memory or i/o spaces organized as a se- quence of 16-bit words. cycles to 32-bit and 16-bit memory or i/o devices may occur in any sequence, since the bs16 control is sampled during each bus cycle. see 4.3.4 dynamic data bus sizing . the byte enable signals, be0 be3 , allow byte granulari- ty when addressing any memory or i/o structure, whether 32 or 16 bits wide. 4.3.4 dynamic data bus sizing dynamic data bus sizing is a feature allowing direct processor connection to 32-bit or 16-bit data buses for memory or i/o. a single processor may connect to both size buses. transfers to or from 32- or 16-bit ports are supported by dynamically determining the bus width during each bus cycle. during each bus cycle an address decoding circuit or the slave de- vice itself may assert bs16 for 16-bit ports, or ne- gate bs16 for 32-bit ports. with bs16 asserted, the processor automatically converts operand transfers larger than 16 bits, or misaligned 16-bit transfers, into two or three trans- fers as required. all operand transfers physically oc- cur on d0 d15 when bs16 is asserted. 
therefore, 16-bit memories or i/o devices only connect on data signals d0-d15. no extra transceivers are required.

asserting bs16 only affects the processor when be2 and/or be3 are asserted during the current cycle. if only d0-d15 are involved with the transfer, asserting bs16 has no effect since the transfer can proceed normally over a 16-bit bus whether bs16 is asserted or not. in other words, asserting bs16 has no effect when only the lower half of the bus is involved with the current cycle.

there are two types of situations where the processor is affected by asserting bs16, depending on which byte enables are asserted during the current bus cycle:

upper half only: only be2 and/or be3 asserted.

upper and lower half: at least be1, be2 asserted (and perhaps also be0 and/or be3).
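the classification above can be written as a small decision function. this is only a sketch under an assumed encoding (bit n of a 4-bit mask set means ben is asserted); the function name and the returned labels are hypothetical. because asserted byte enables are always contiguous, "both halves involved" is equivalent to the text's "at least be1, be2 asserted".

```python
def bs16_case(mask: int) -> str:
    """Classify a bus cycle by which data bus halves its byte enables involve.

    Assumed encoding (illustrative only): bit n of `mask` set means BEn is asserted.
    """
    if not 0 < mask <= 0xF:
        raise ValueError("no byte enable asserted")
    lower = mask & 0b0011   # BE0, BE1 -> D0-D15
    upper = mask & 0b1100   # BE2, BE3 -> D16-D31
    if not upper:
        return "lower half only"      # asserting BS16 has no effect
    if not lower:
        return "upper half only"
    return "upper and lower half"     # BS16 asserted leads to two 16-bit cycles
```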
military intel386 tm microprocessor effect of asserting bs16 during ``upper half only'' read cycles: asserting bs16 during ``upper half only'' reads causes the military intel386 processor to read data on the lower 16 bits of the data bus and ig- nore data on the upper 16 bits of the data bus. data that would have been read from d16 d31 (as indicated by be2 and be3 ) will instead be read from d0 d15 respectively. effect of asserting bs16 during ``upper half only'' write cycles: asserting bs16 during ``upper half only'' writes does not affect the military intel386 processor. when only be2 and/or be3 are asserted during a write cycle the military intel386 processor always duplicates data signals d16 d31 onto d0 d15 (see table 5-1). therefore, no further military intel386 processor action is required to perform these writes on 32-bit or 16-bit buses. effect of asserting bs16 during ``upper and lower half'' read cycles: asserting bs16 during ``upper and lower half'' reads causes the processor to perform two 16-bit read cycles for complete physical operand trans- fer. bytes 0 and 1 (as indicated by be0 and be1 ) are read on the first cycle using d0 d15. bytes 2 and 3 (as indicated by be2 and be3 ) are read during the second cycle, again using d0 d15. d16 d31 are ignored during both 16-bit cycles. be0 and be1 are always negated during the sec- ond 16-bit cycle (see figure 5-14, cycles 2 and 2a ). effect of asserting bs16 during ``upper and lower half'' write cycles: asserting bs16 during ``upper and lower half'' writes causes the military intel386 processor to perform two 16-bit write cycles for complete physi- cal operand transfer. all bytes are available the first write cycle allowing external hardware to re- ceive bytes 0 and 1 (as indicated by be0 and be1 ) using d0 d15. on the second cycle the mili- tary intel386 processor duplicates bytes 2 and 3 on d0 d15 and bytes 2 and 3 (as indicated by be2 and be3 ) are written using d0 d15. 
be0 and be1 are always negated during the second 16-bit cycle. bs16 must be asserted during the second 16-bit cycle. see figure 5-14, cycles 1 and 1a.

4.3.5 interfacing with 32- and 16-bit memories

in 32-bit-wide physical memories such as figure 5-5, each physical dword begins at a byte address that is a multiple of 4. a2-a31 are directly used as a dword select and be0-be3 as byte selects. bs16 is negated for all bus cycles involving the 32-bit array.

figure 5-5. military intel386 processor with 32-bit memory

figure 5-6. military intel386 processor with 32-bit and 16-bit memory

when 16-bit-wide physical arrays are included in the system, as in figure 5-6, each 16-bit physical word begins at an address that is a multiple of 2. note the address is decoded, to assert bs16 only during bus cycles involving the 16-bit array. (if desiring to use
pipelined address with 16-bit memories then be0-be3 and w/r are also decoded to determine when bs16 should be asserted. see 4.4.3.7 maximum pipelined address usage with 16-bit bus size.)

a2-a31 are directly usable for addressing 32-bit and 16-bit devices. to address 16-bit devices, a1 and two byte enable signals are also needed.

to generate an a1 signal and two byte enable signals for 16-bit access, be0-be3 should be decoded as in table 5-7. note certain combinations of be0-be3 are never generated by the military intel386 processor, leading to ``don't care'' conditions in the decoder. any be0-be3 decoder, such as figure 5-7, may use the non-occurring be0-be3 combinations to its best advantage.

4.3.6 operand alignment

with the flexibility of memory addressing on the military intel386 processor, it is possible to transfer a logical operand that spans more than one physical dword or word of memory or i/o. examples are 32-bit dword operands beginning at addresses not evenly divisible by 4, or a 16-bit word operand split between two physical dwords of the memory array.

operand alignment and data bus size dictate when multiple bus cycles are required. table 5-8 describes the transfer cycles generated for all combinations of logical operand lengths, alignment, and data bus sizing. when multiple bus cycles are required to transfer a multi-byte logical operand, the highest-order bytes are transferred first (but if bs16 asserted requires two 16-bit cycles be performed, that part of the transfer is low-order first).

4.4 bus functional description

4.4.1 introduction

the military intel386 processor has separate, parallel buses for data and address. the data bus is 32 bits in width, and bidirectional. the address bus provides a 32-bit value using 30 signals for the 30 upper-order address bits and 4 byte enable signals to directly indicate the active bytes.
these buses are interpreted and controlled via several associated definition or control signals.

table 5-7. generating a1, bhe and ble for addressing 16-bit devices

m80386 signals           16-bit bus signals
be3   be2   be1   be0    a1    bhe   ble (a0)    comments
h*    h*    h*    h*     x     x     x           x - no active bytes
h     h     h     l      l     h     l
h     h     l     h      l     l     h
h     h     l     l      l     l     l
h     l     h     h      h     h     l
h*    l*    h*    l*     x     x     x           x - not contiguous bytes
h     l     l     h      l     l     h
h     l     l     l      l     l     l
l     h     h     h      h     l     h
l*    h*    h*    l*     x     x     x           x - not contiguous bytes
l*    h*    l*    h*     x     x     x           x - not contiguous bytes
l*    h*    l*    l*     x     x     x           x - not contiguous bytes
l     l     h     h      h     l     l
l*    l*    h*    l*     x     x     x           x - not contiguous bytes
l     l     l     h      l     l     h
l     l     l     l      l     l     l

ble asserted when d0-d7 of 16-bit bus is active.
bhe asserted when d8-d15 of 16-bit bus is active.
a1 low for all even words; a1 high for all odd words.

key:
x = don't care
h = high voltage level
l = low voltage level
* = a non-occurring pattern of byte enables; either none are asserted, or the pattern has byte enables asserted for non-contiguous bytes
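the occurring rows of table 5-7 follow a simple rule: a1 is low when the lower half of the dword (be0 or be1) is involved, and ble and bhe then mirror the even and odd byte enables of the selected word. the sketch below expresses that rule; it assumes a 4-bit mask in which bit n set means ben is asserted (driven low), returns pin levels as 0 for low and 1 for high, and uses a hypothetical function name. a hardware decoder would be free to exploit the don't-care rows, which this sketch simply rejects.

```python
def decode_for_16bit_bus(mask: int) -> tuple:
    """Return (A1, BHE, BLE) voltage levels (0 = low, 1 = high) per Table 5-7.

    Assumed encoding (illustrative only): bit n of `mask` set means BEn is asserted.
    BHE and BLE are active low, so level 0 means asserted.
    """
    lowest = mask & -mask
    if not 0 < mask <= 0xF or (mask // lowest) & (mask // lowest + 1):
        raise ValueError("non-occurring byte enable pattern")
    if mask & 0b0011:                    # BE0 or BE1 asserted: even word
        a1, word = 0, mask & 0b0011
    else:                                # only BE2 and/or BE3: odd word
        a1, word = 1, (mask >> 2) & 0b0011
    ble = 0 if word & 0b01 else 1        # low byte of the word, D0-D7
    bhe = 0 if word & 0b10 else 1        # high byte of the word, D8-D15
    return a1, bhe, ble
```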
figure 5-7. logic to generate a1, bhe and ble for 16-bit buses (k-maps for the a1 signal, same as figure 5-3; the 16-bit bhe signal; and the 16-bit ble signal, same as the a0 signal in figure 5-3)

table 5-8. transfer bus cycles for bytes, words and dwords

byte-length of logical operand     1     2                                4
physical byte address in memory    xx    00    01       10    11          00       01           10       11
(low-order bits)
transfer cycles over               b     w     w        w     hb, lb *    d        hb, l3       hw, lw   h3, lb
32-bit data bus
transfer cycles over               b     w     lb, hb   w     hb, lb      lw, hw   hb, lb, mw   hw, lw   mw, hb, lb
16-bit data bus

key:
b = byte transfer
3 = 3-byte transfer
w = word transfer
d = dword transfer
l = low-order portion
h = high-order portion
m = mid-order portion
x = don't care
[symbol missing] = bs16 asserted causes second bus cycle
* for this case, m8086, 88, 186, 188, 286 transfer lb first, then hb.
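the entries of table 5-8 can be derived from two rules stated in 4.3.6: an operand crossing a dword boundary is split there and the highest-order part is transferred first, and when bs16 forces a part into two 16-bit cycles, that part goes low-order first. the following is a minimal sketch of that derivation; the function name and the (byte address, byte count) representation of a cycle are assumptions for illustration, and `bus16=True` assumes bs16 is asserted for every cycle of the operand.

```python
def transfer_cycles(address: int, length: int, bus16: bool = False) -> list:
    """Return the bus cycles for one operand as (byte_address, byte_count) pairs,
    in the order they are issued."""
    if length not in (1, 2, 4):
        raise ValueError("operand must be a byte, word or dword")
    end = address + length
    boundary = (address | 3) + 1               # start of the next physical dword
    parts = [(address, min(end, boundary) - address)]
    if end > boundary:
        parts.append((boundary, end - boundary))
    parts.reverse()                            # highest-order part first

    cycles = []
    for start, count in parts:
        middle = (start & ~3) + 2              # boundary between D0-D15 and D16-D31
        if bus16 and start < middle < start + count:
            cycles.append((start, middle - start))           # low-order 16-bit cycle
            cycles.append((middle, start + count - middle))  # second 16-bit cycle
        else:
            cycles.append((start, count))
    return cycles
```

for a dword at an address ending in binary 01, the sketch gives a 1-byte cycle followed by a 3-byte cycle on a 32-bit bus (hb, l3), and three cycles on a 16-bit bus (hb, lb, mw), in agreement with the table.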
military intel386 tm microprocessor the definition of each bus cycle is given by three definition signals: m/io , w/r and d/c . at the same time, a valid address is present on the byte enable signals be0 be3 and other address signals a2 a31. a status signal, ads , indicates when the m80386 issues a new bus cycle definition and ad- dress. collectively, the address bus, data bus and all asso- ciated control signals are referred to simply as ``the bus''. when active, the bus performs one of the bus cycles below: 1) read from memory space 2) locked read from memory space 3) write to memory space 4) locked write to memory space 5) read from i/o space (or coprocessor) 6) write to i/o space (or coprocessor) 7) interrupt acknowledge 8) indicate halt, or indicate shutdown table 5-2 shows the encoding of the bus cycle defi- nition signals for each bus cycle. see section 4.2.5 bus cycle definition . the data bus has a dynamic sizing feature support- ing 32- and 16-bit bus size. data bus size is indicated to the military intel386 processor using its bus size 16 (bs16 ) input. all bus functions can be performed with either data bus size. when the military intel386 processor bus is not per- forming one of the activities listed above, it is either idle or in the hold acknowledge state, which may be detected by external circuitry. the idle state can be identified by the military intel386 processor giving no further assertions on its address strobe output (ads ) since the beginning of its most recent bus cycle, and the most recent bus cycle has been terminated. the hold acknowledge state is identified by the military intel386 processor asserting its hold acknowledge (hlda) output. the shortest time unit of bus activity is a bus state. a bus state is one processor clock period (two clk2 periods) in duration. a complete data transfer occurs during a bus cycle, composed of two or more bus states. 271052 11 fastest non-pipelined bus cycles consist of t1 and t2 figure 5-8. 
fastest read cycles with non-pipelined address timing
military intel386 tm microprocessor the fastest military intel386 processor bus cycle re- quires only two bus states. for example, three con- secutive bus read cycles, each consisting of two bus states, are shown by figure 5-8. the bus states in each cycle are named t1 and t2 . any memory or i/o address may be accessed by such a two-state bus cycle, if the external hardware is fast enough. the high-bandwidth, two-clock bus cycle realizes the full potential of fast main memory, or cache memory. every bus cycle continues until it is acknowledged by the external system hardware, using the military intel386 processor ready input. acknowledging the bus cycle at the end of the first t2 results in the shortest bus cycle, requiring only t1 and t2. if ready is not immediately asserted, however, t2 states are repeated indefinitely until the ready in- put is sampled asserted. 4.4.2 address pipelining the address pipelining option provides a choice of bus cycle timings. pipelined or non-pipelined ad- dress timing is selectable on a cycle-by-cycle basis with the next address (na ) input. when address pipelining is not selected, the current address and bus cycle definition remain stable throughout the bus cycle. when address pipelining is selected, the address (be0 be3 , a2 a31) and definition (w/r , d/c and m/io ) of the next cycle are available before the end of the current cycle. to signal their availability, the military intel386 processor address status output (ads ) is also asserted. figure 5-9 illustrates the fast- est read cycles with pipelined address timing. note from figure 5-9 the fastest bus cycles using pipelined address require only two bus states, named t1p and t2p . therefore cycles with pipe- lined address timing allow the same data bandwidth as non-pipelined cycles, but address-to-data access time is increased compared to that of a non-pipe- lined cycle. 
by increasing the address-to-data access time, pipelined address timing reduces wait state requirements. for example, if one wait state is required with non-pipelined address timing, no wait states would be required with pipelined address.

figure 5-9. fastest read cycles with pipelined address timing (fastest pipelined bus cycles consist of t1p and t2p)
military intel386 tm microprocessor pipelined address timing is useful in typical systems having address latches. in those systems, once an address has been latched, pipelined availability of the next address allows decoding circuitry to gener- ate chip selects (and other necessary select signals) in advance, so selected devices are accessed im- mediately when the next cycle begins. in other words, the decode time for the next cycle can be overlapped with the end of the current cycle. if a system contains a memory structure of two or more interleaved memory banks, pipelined address timing potentially allows even more overlap of activi- ty. this is true when the interleaved memory control- ler is designed to allow the next memory operation to begin in one memory bank while the current bus cycle is still activating another memory bank. figure 5-10 shows the general structure of the military intel386 processor with 2-bank and 4-bank inter- leaved memory. note each memory bank of the in- terleaved memory has full data bus width (32-bit data width typically, unless 16-bit bus size is select- ed). further details of pipelined address timing are given in 4.4.3.4 pipelined address, 4.4.3.5 initiating and maintaining pipelined address, 4.4.3.6 pipelined address with dynamic bus sizing, and 4.4.3.7 maximum pipelined address usage with 16-bit bus size . two-bank interleaved memory a) address signal a2 selects bank b) 32-bit datapath to each bank 271052 13 four-bank interleaved memory a) address signals a3 and a2 select bank b) 32-bit datapath to each bank 271052 14 figure 5-10. 2-bank and 4-bank interleaved memory structure 76
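the bank selection shown in figure 5-10 amounts to taking the address bits just above the byte-within-dword bits. a minimal sketch, with a hypothetical function name and assuming 32-bit-wide banks as in the figure:

```python
def bank_index(address: int, banks: int) -> int:
    """Return the interleaved bank selected by an address.

    With 2 banks, address bit A2 selects the bank; with 4 banks, A3 and A2 do.
    """
    if banks not in (2, 4):
        raise ValueError("figure 5-10 covers 2-bank and 4-bank structures only")
    return (address >> 2) & (banks - 1)
```

consecutive dwords therefore fall in different banks, which is what lets the next memory operation begin in one bank while the current bus cycle is still using another.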
military intel386 tm microprocessor 4.4.3 read and write cycles 4.4.3.1 introduction data transfers occur as a result of bus cycles, classi- fied as read or write cycles. during read cycles, data is transferred from an external device to the proces- sor. during write cycles data is transferred in the oth- er direction, from the processor to an external de- vice. two choices of address timing are dynamically se- lectable: non-pipelined, or pipelined. after a bus idle state, the processor always uses non-pipelined ad- dress timing. however, the na (next address) input may be asserted to select pipelined address timing for the next bus cycle. when pipelining is se- lected and the military intel386 processor has a bus request pending internally, the address and defini- tion of the next cycle is made available even before the current bus cycle is acknowledged by ready . generally, the na input is sampled each bus cycle to select the desired address timing for the next bus cycle. two choices of physical data bus width are dynami- cally selectable: 32 bits, or 16 bits. generally, the bs16 (bus size 16) input is sampled near the end of the bus cycle to confirm the physical data bus size applicable to the current cycle. negation of bs16 indicates a 32-bit size, and assertion indicates a 16- bit bus size. if 16-bit bus size is indicated, the military intel386 processor automatically responds as required to complete the transfer on a 16-bit data bus. depend- ing on the size and alignment of the operand, anoth- er 16-bit bus cycle may be required. table 5-7 pro- vides all details. when necessary, the military in- tel386 processor performs an additional 16-bit bus cycle, using d0 d15 in place of d16 d31. terminating a read cycle or write cycle, like any bus cycle, requires acknowledging the cycle by asserting the ready input. until acknowledged, the proces- sor inserts wait states into the bus cycle, to allow adjustment for the speed of any external device. 
external hardware, which has decoded the address and bus cycle type, asserts the ready input at the appropriate time.

figure 5-11. various bus cycles and idle states with non-pipelined address (zero wait states)

note: idle states are shown here for diagram variety only. write cycles are not always followed by an idle state. an active bus cycle can immediately follow the write cycle.
military intel386 tm microprocessor at the end of the second bus state within the bus cycle, ready is sampled. at that time, if external hardware acknowledges the bus cycle by asserting ready , the bus cycle terminates as shown in figure 5-11. if ready is negated as in figure 5-12, the cycle continues another bus state (a wait state) and ready is sampled again at the end of that state. this continues indefinitely until the cycle is acknowl- edged by ready asserted. when the current cycle is acknowledged, the military intel386 processor terminates it. when a read cycle is acknowledged, the military intel386 processor latches the information present at its data pins. when a write cycle is acknowledged, the military intel386 processor write data remains valid through- out phase one of the next bus state, to provide write data hold time. 4.4.3.2 non-pipelined address any bus cycle may be performed with non-pipelined address timing. for example, figure 5-11 shows a mixture of read and write cycles with non-pipelined address timing. figure 5-11 shows the fastest possi- ble cycles with non-pipelined address have two bus states per bus cycle. the states are named t1 and t2. in phase one of the t1, the address signals and bus cycle definition signals are driven valid, and to signal their availability, address status (ads )is simultaneously asserted. during read or write cycles, the data bus behaves as follows. if the cycle is a read, the military intel386 processor floats its data signals to allow driving by the external device being addressed. if the cycle is a write, data signals are driven by the military intel386 processor beginning in phase two of t1 until phase one of the bus state following cycle acknowledg- ment. figure 5-12 illustrates non-pipelined bus cycles with one wait added to cycles 2 and 3. ready is sam- pled negated at the end of the first t2 in cycles 2 and 3. therefore cycles 2 and 3 have t2 repeated. 
at the end of the second t2, ready is sampled asserted.

figure 5-12. various bus cycles and idle states with non-pipelined address (various number of wait states)

note: idle states are shown here for diagram variety only. write cycles are not always followed by an idle state. an active bus cycle can immediately follow the write cycle.
figure 5-13. military intel386 processor bus states (not using pipelined address)

bus states:
t1 - first clock of a non-pipelined bus cycle (military intel386 processor drives new address and asserts ads)
t2 - subsequent clocks of a bus cycle when na has not been sampled asserted in the current bus cycle
ti - idle state
th - hold acknowledge state (military intel386 processor asserts hlda)

the fastest bus cycle consists of two states: t1 and t2.

four basic bus states describe bus operation when not using pipelined address. these states do include bs16 usage for 32-bit and 16-bit bus size. if asserting bs16 requires a second 16-bit bus cycle to be performed, it is performed before hold asserted is acknowledged.

when address pipelining is not used, the address and bus cycle definition remain valid during all wait states. when wait states are added and you desire to maintain non-pipelined address timing, it is necessary to negate na during each t2 state except the last one, as shown in figure 5-12 cycles 2 and 3. if na is sampled asserted during a t2 other than the last one, the next state would be t2i (for pipelined address) or t2p (for pipelined address) instead of another t2 (for non-pipelined address).

when address pipelining is not used, the bus states and transitions are completely illustrated by figure 5-13. the bus transitions between four possible states: t1, t2, ti, and th. bus cycles consist of t1 and t2, with t2 being repeated for wait states. otherwise, the bus may be idle, in the ti state, or in hold acknowledge, the th state.

when address pipelining is not used, the bus state diagram is as shown in figure 5-13. when the bus is idle it is in state ti. bus cycles always begin with t1. t1 always leads to t2. if a bus cycle is not acknowledged during t2 and na is negated, t2 is repeated.
when a cycle is acknowledged during t2, the follow- ing state will be t1 of the next bus cycle if a bus request is pending internally, or ti if there is no bus request pending, or th if the hold input is being asserted. the bus state diagram in figure 5-13 also applies to the use of bs16 . if the military intel386 processor makes internal adjustments for 16-bit bus size, the adjustments do not affect the external bus states. if an additional 16-bit bus cycle is required to complete a transfer on a 16-bit bus, it also follows the state transitions shown in figure 5-13. use of pipelined address allows the military intel386 processor to enter three additional bus states not shown in figure 5-13. figure 5-20 in 4.4.3.4 pipe- lined address is the complete bus state diagram, including pipelined address cycles. 79
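the non-pipelined state sequence described above can be sketched as follows. this is an illustration only, with hypothetical function names: it assumes na stays negated throughout, takes one ready sample per t2 state, and assumes that hold asserted takes precedence over a pending bus request when choosing the state after an acknowledged cycle (the text lists the three outcomes without ranking them).

```python
def bus_cycle_states(ready_samples) -> list:
    """Return the states of one non-pipelined bus cycle.

    `ready_samples` holds one boolean per T2 state: True when READY is
    sampled asserted at the end of that state.
    """
    states = ["T1"]                  # bus cycles always begin with T1
    for ready in ready_samples:
        states.append("T2")          # T1 leads to T2; T2 repeats as a wait state
        if ready:
            return states            # acknowledged: the cycle terminates
    raise ValueError("bus cycle was never acknowledged")


def state_after_cycle(request_pending: bool, hold: bool) -> str:
    """Return the state following an acknowledged T2.

    Assumption for this sketch: HOLD asserted wins over a pending request.
    """
    if hold:
        return "Th"
    return "T1" if request_pending else "Ti"
```

`bus_cycle_states([False, True])`, for instance, models a cycle with one wait state: t1 followed by two t2 states.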
4.4.3.3 non-pipelined address with dynamic data bus sizing

the physical data bus width for any non-pipelined bus cycle can be either 32 bits or 16 bits. at the beginning of the bus cycle, the processor behaves as if the data bus is 32 bits wide. when the bus cycle is acknowledged, by asserting ready at the end of a t2 state, the most recent sampling of bs16 determines the data bus size for the cycle being acknowledged. if bs16 was most recently negated, the physical data bus size is defined as 32 bits. if bs16 was most recently asserted, the size is defined as 16 bits.

when bs16 is asserted and two 16-bit bus cycles are required to complete the transfer, bs16 must be asserted during the second cycle; 16-bit bus size is not assumed. like any bus cycle, the second 16-bit cycle must be acknowledged by asserting ready.

figure 5-14. asserting bs16 (zero wait states, non-pipelined address)
key: Dn = physical data pin n; dn = logical data bit n

when a second 16-bit bus cycle is required to complete the transfer over a 16-bit bus, the addresses
generated for the two 16-bit bus cycles are closely related to each other. the addresses are the same except be0 and be1 are always negated for the second cycle. this is because data on d0-d15 was already transferred during the first 16-bit cycle.

figures 5-14 and 5-15 show cases where assertion of bs16 requires a second 16-bit cycle for complete operand transfer. figure 5-14 illustrates cycles without wait states. figure 5-15 illustrates cycles with one wait state. in figure 5-15 cycle 1, the bus cycle during which bs16 is asserted, note that na must be negated in the t2 state(s) prior to the last t2 state. this is to allow the recognition of bs16 asserted in the final t2 state. the relation of na and bs16 is given fully in 4.4.3.4 pipelined address, but figure 5-15 illustrates the only precaution you need to know when using bs16 with non-pipelined address.

figure 5-15. asserting bs16 (one wait state, non-pipelined address)
key: Dn = physical data pin n; dn = logical data bit n
4.4.3.4 Pipelined Address

Address pipelining is the option of requesting the address and the bus cycle definition of the next, internally pending bus cycle before the current bus cycle is acknowledged with READY asserted. ADS is asserted by the Military Intel386 processor when the next address is issued. The address pipelining option is controlled on a cycle-by-cycle basis with the NA input signal.

Once a bus cycle is in progress and the current address has been valid for at least one entire bus state, the NA input is sampled at the end of every phase one until the bus cycle is acknowledged. During non-pipelined bus cycles, therefore, NA is sampled at the end of phase one in every T2. An example is cycle 2 in Figure 5-16, during which NA is sampled at the end of phase one of every T2 (it was asserted once during the first T2 and has no further effect during that bus cycle).

If NA is sampled asserted, the Military Intel386 processor is free to drive the address and bus cycle definition of the next bus cycle, and assert ADS, as soon as it has a bus request internally pending. It may drive the next address as early as the next bus state, whether the current bus cycle is acknowledged at that time or not.

Regarding the details of address pipelining, the Military Intel386 processor has the following characteristics:

1) For NA to be sampled asserted, BS16 must be negated at that sampling window (see Figure 5-16 cycles 3 and 4, and Figure 5-17 cycles 2 through 4). If NA and BS16 are both sampled asserted during the last T2 period of a bus cycle, BS16 asserted has priority. Therefore, if both are asserted, the current bus size is taken to be 16 bits and the next address is not pipelined. Conceptually, Figure 5-18 shows the internal M80386 logic providing these characteristics.

Figure 5-16. Transitioning to pipelined address during burst of bus cycles. Following any idle bus state (Ti), addresses are non-pipelined. Within non-pipelined bus cycles, NA is only sampled during wait states. Therefore, to begin address pipelining during a group of non-pipelined bus cycles requires a non-pipelined cycle with at least one wait state (cycle 2 in the figure).

Figure 5-17. Fastest transition to pipelined address following idle bus state. Following any idle bus state (Ti) the address is always non-pipelined and NA is only sampled during wait states. To start address pipelining after an idle state requires a non-pipelined cycle with at least one wait state (cycle 1 in the figure). The pipelined cycles (2, 3, 4 in the figure) are shown with various numbers of wait states.

2) The next address may appear as early as the bus state after NA was sampled asserted (see Figures 5-16 or 5-17). In that case, state T2P is entered immediately. However, when there is not an internal bus request already pending, the next address will not be available immediately after NA is asserted and T2I is entered instead of T2P (see Figure 5-19 cycle 3). Provided the current bus cycle isn't yet acknowledged by READY asserted, T2P will be entered as soon as the M80386 does drive the next address. External hardware should therefore observe the ADS output as confirmation the next address is actually being driven on the bus.

3) Once NA is sampled asserted, the Military Intel386 processor commits itself to the highest priority bus request that is pending internally. It can no longer perform another 16-bit transfer to the same address should BS16 be asserted externally, so thereafter must assume the current bus size is 32 bits. Therefore if NA is sampled asserted within a bus cycle, BS16 must be negated thereafter in that bus cycle (see Figures 5-16, 5-17, 5-19). Consequently, do not assert NA during bus cycles which must have BS16 driven asserted. See 4.4.3.6 Pipelined Address with Dynamic Data Bus Sizing.

4) Any address which is validated by a pulse on the Military Intel386 processor ADS output will remain stable on the address pins for at least two processor clock periods. The Military Intel386 processor cannot produce a new address more frequently than every two processor clock periods (see Figures 5-16, 5-17, 5-19).

5) Only the address and bus cycle definition of the very next bus cycle is available. The pipelining capability cannot look further than one bus cycle ahead (see Figure 5-19 cycle 1).
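Characteristic 4 puts an upper bound on bus throughput: at most one new address, and so at most one 32-bit transfer, every two processor clock periods. A minimal sketch of that arithmetic follows. The function name is hypothetical, and zero wait states on every cycle is an assumption.

```python
def peak_bandwidth_mb_per_s(processor_mhz, bus_bytes=4, clocks_per_cycle=2):
    """Peak transfer rate in megabytes per second.

    Assumes every bus cycle moves `bus_bytes` bytes and completes in
    `clocks_per_cycle` processor clocks (no wait states).
    """
    return bus_bytes * processor_mhz / clocks_per_cycle


if __name__ == "__main__":
    for mhz in (16, 20, 25):
        print(mhz, "MHz ->", peak_bandwidth_mb_per_s(mhz), "MB/s")
```

At a 16 MHz processor clock this gives 32 megabytes per second, which matches the bus bandwidth quoted in the feature list at the head of the datasheet.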
Figure 5-18. Military Intel386 processor internal logic on NA and BS16

The complete bus state transition diagram, including operation with pipelined address, is given by Figure 5-20. Note that it is a superset of the diagram for non-pipelined address only, and the three additional bus states for pipelined address are drawn in bold.

The fastest bus cycle with pipelined address consists of just two bus states, T1P and T2P (recall that for non-pipelined address it is T1 and T2). T1P is the first bus state of a pipelined cycle.

4.4.3.5 Initiating and Maintaining Pipelined Address

Using the state diagram Figure 5-20, observe the transitions from an idle state, Ti, to the beginning of a pipelined bus cycle, T1P. From an idle state Ti, the first bus cycle must begin with T1, and is therefore a non-pipelined bus cycle. The next bus cycle will be pipelined, however, provided NA is asserted and the first bus cycle ends in a T2P state (the address for the next bus cycle is driven during T2P). The fastest path from an idle state to a bus cycle with pipelined address is:

Ti, Ti, Ti,       T1 - T2 - T2P,          T1P - T2P,
idle states       non-pipelined cycle     pipelined cycle

T1-T2-T2P are the states of the bus cycle that establishes address pipelining for the next bus cycle, which begins with T1P. The same is true after a bus hold state, shown below:

Th, Th, Th,                 T1 - T2 - T2P,          T1P - T2P,
hold acknowledge states     non-pipelined cycle     pipelined cycle

The transition to pipelined address is shown functionally by Figure 5-17 cycle 1. Note that cycle 1 is used to transition into pipelined address timing for the subsequent cycles 2, 3 and 4, which are pipelined. The NA input is asserted at the appropriate time to select address pipelining for cycles 2, 3 and 4.

Once a bus cycle is in progress and the current address has been valid for one entire bus state, the NA input is sampled at the end of every phase one until the bus cycle is acknowledged. During Figure 5-17 cycle 1 therefore, sampling begins in T2. Once NA is sampled asserted during the current cycle, the Military Intel386 processor is free to drive a new address and bus cycle definition on the bus as early as the next bus state. In Figure 5-16 cycle 1 for example, the next address is driven during state T2P. Thus cycle 1 makes the transition to pipelined address timing, since it begins with T1 but ends with T2P. Because the address for cycle 2 is available before cycle 2 begins, cycle 2 is called a pipelined bus cycle, and it begins with T1P. Cycle 2 begins as soon as READY asserted terminates cycle 1.

Example transition bus cycles are Figure 5-17 cycle 1 and Figure 5-16 cycle 2. Figure 5-17 shows transition during the very first cycle after an idle bus state, which is the fastest possible transition into address pipelining. Figure 5-16 cycle 2 shows a transition cycle occurring during a burst of bus cycles. In any case, a transition cycle is the same whenever it occurs: it consists at least of T1, T2 (you assert NA at that time), and T2P (provided the Military Intel386 processor has an internal bus request already pending, which it almost always has). T2P states are repeated if wait states are added to the cycle.

Note that three states (T1, T2 and T2P) are only required in a bus cycle performing a transition from non-pipelined address into pipelined address timing, for example Figure 5-17 cycle 1. Figure 5-17 cycles 2, 3 and 4 show that address pipelining can be maintained with two-state bus cycles consisting only of T1P and T2P.

Once a pipelined bus cycle is in progress, pipelined timing is maintained for the next cycle by asserting NA and detecting that the Military Intel386 processor enters T2P during the current bus cycle. The current bus cycle must end in state T2P for pipelining to be maintained in the next cycle. T2P is identified by the assertion of ADS. Figures 5-16 and 5-17, however, each show pipelining ending after cycle 4 because cycle 4 ends in T2I. This indicates the Military Intel386 processor didn't have an internal bus request prior to the acknowledgement of cycle 4. If a cycle ends with a T2 or T2I, the next cycle will not be pipelined.
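The transitions described in this section can be sketched as a small state function. This is a simplified model, not the complete diagram of Figure 5-20: hold (Th), reset and BS16 are left out, `na` stands for "NA sampled asserted with BS16 negated", and the transitions out of T1P are an assumption drawn from the statement that pipelining is maintained with two-state T1P-T2P cycles. All names are hypothetical.

```python
def next_state(state, ready=False, na=False, pending=True):
    """Next bus state, given the inputs sampled in the current state.

    ready   -- READY sampled asserted (the current cycle is acknowledged)
    na      -- NA sampled asserted (BS16 assumed negated)
    pending -- an internal bus request is pending
    """
    if state == "TI":
        return "T1" if pending else "TI"
    if state == "T1":
        return "T2"  # NA is first sampled in T2 of a non-pipelined cycle
    if state == "T1P":  # assumption: NA is already sampled in T1P
        if not na:
            return "T2"
        return "T2P" if pending else "T2I"
    if state in ("T2", "T2I"):
        if ready:  # a cycle ending in T2 or T2I is not followed by T1P
            return "T1" if pending else "TI"
        if state == "T2" and not na:
            return "T2"
        return "T2P" if pending else "T2I"
    if state == "T2P":
        return "T1P" if ready else "T2P"
    raise ValueError("unknown bus state: " + state)


def trace(start, inputs):
    """Apply a list of per-state input dictionaries and list the states."""
    states = [start]
    for step in inputs:
        states.append(next_state(states[-1], **step))
    return states


if __name__ == "__main__":
    # Fastest path from idle: one wait state in the first cycle, NA asserted.
    print(trace("TI", [{}, {}, {"na": True}, {"ready": True}, {"na": True}]))
```

Under these assumptions the example run visits Ti, T1, T2, T2P, T1P, T2P, which is the fastest path from an idle state to a pipelined cycle given in the text.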
Figure 5-19. Details of address pipelining during cycles with wait states
Figure 5-20. Military Intel386 processor complete bus states (including pipelined address)

Bus states:
T1: first clock of a non-pipelined bus cycle (Military Intel386 processor drives new address and asserts ADS).
T2: subsequent clocks of a bus cycle when NA has not been sampled asserted in the current bus cycle.
T2I: subsequent clocks of a bus cycle when NA has been sampled asserted in the current bus cycle but there is not yet an internal bus request pending (Military Intel386 processor will not drive new address or assert ADS).
T2P: subsequent clocks of a bus cycle when NA has been sampled asserted in the current bus cycle and there is an internal bus request pending (Military Intel386 processor drives new address and asserts ADS).
T1P: first clock of a pipelined bus cycle.
Ti: idle state.
Th: hold acknowledge state (Military Intel386 processor asserts HLDA).

Asserting NA for pipelined address gives access to three more bus states: T2I, T2P and T1P. Using pipelined address, the fastest bus cycle consists of T1P and T2P.

Realistically, address pipelining is almost always maintained as long as NA is sampled asserted. This is so because in the absence of any other request, a code prefetch request is always internally pending until the instruction decoder and code prefetch queue are completely full. Therefore address pipelining is maintained for long bursts of bus cycles, if the bus is available (i.e., HOLD negated) and NA is sampled asserted in each of the bus cycles.

4.4.3.6 Pipelined Address with Dynamic Data Bus Sizing

The BS16 feature allows easy interface to 16-bit data buses. When asserted, the Military Intel386 processor bus interface hardware performs appropriate action to make the transfer using a 16-bit data bus connected on D0-D15.

There is a degree of interaction, however, between the use of address pipelining and the use of bus size 16.
The interaction results from the multiple bus cycles required when transferring 32-bit operands over a 16-bit bus. If the operand requires both 16-bit halves of the 32-bit bus, the appropriate Military Intel386 processor action is a second bus cycle to complete the operand's transfer. It is this necessity that conflicts with NA usage.

When NA is sampled asserted, the Military Intel386 processor commits itself to perform the next internally pending bus request, and is allowed to drive the next internally pending address onto the bus. Asserting NA therefore makes it impossible for the next bus cycle to again access the current address on A2-A31, such as may be required when BS16 is asserted by the external hardware.

To avoid conflict, the Military Intel386 processor is designed with the following two provisions:

1) To avoid conflict, BS16 must be negated in the current bus cycle if NA has already been sampled asserted in the current cycle. If NA is sampled asserted, the current data bus size is assumed to be 32 bits.

2) To also avoid conflict, if NA and BS16 are both asserted during the same sampling window, BS16 asserted has priority and the Military Intel386 processor acts as if NA was negated at that time. Internal Military Intel386 processor circuitry, shown conceptually in Figure 5-18, assures that BS16 is sampled asserted and NA is sampled negated if both inputs are externally asserted at the same sampling window.

Figure 5-21. Using NA and BS16. Key: Dn = physical data pin n; dn = logical data bit n. Cycles 1 and 2 are pipelined. Cycle 1a cannot be pipelined, but its address can be inferred from that of cycle 1, to externally simulate address pipelining during cycle 1a.
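The second provision amounts to a one-line priority rule. The sketch below expresses it with booleans, where `True` means "asserted". The function names are hypothetical, and it is a conceptual stand-in for the logic of Figure 5-18, not a description of the actual circuit.

```python
def sample_na_bs16(na_pin, bs16_pin):
    """Return (na_sampled, bs16_sampled) for one sampling window.

    BS16 asserted has priority: when both pins are asserted, NA is
    treated as negated.
    """
    bs16_sampled = bs16_pin
    na_sampled = na_pin and not bs16_pin
    return na_sampled, bs16_sampled


def bus_size_bits(bs16_sampled):
    """Data bus size implied by the sampled BS16 value."""
    return 16 if bs16_sampled else 32


if __name__ == "__main__":
    for na in (False, True):
        for bs16 in (False, True):
            print(na, bs16, "->", sample_na_bs16(na, bs16))
```

External logic that mirrors this rule can tell when an attempt to pipeline the next address will be ignored because BS16 won the sampling window.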
Certain types of 16-bit or 8-bit operands require no adjustment for correct transfer on a 16-bit bus. Those are read or write operands using only the lower half of the data bus, and write operands using only the upper half of the bus, since the Military Intel386 processor simultaneously duplicates the write data on the lower half of the data bus. For these patterns of byte enables and the R/W signals, BS16 need not be asserted at the Military Intel386 processor, allowing NA to be asserted during the bus cycle if desired.

4.4.4 Interrupt Acknowledge (INTA) Cycles

In response to an interrupt request on the INTR input when interrupts are enabled, the Military Intel386 processor performs two interrupt acknowledge cycles. These bus cycles are similar to read cycles in that bus definition signals define the type of bus activity taking place, and each cycle continues until acknowledged by READY sampled asserted.

The state of A2 distinguishes the first and second interrupt acknowledge cycles. The byte address driven during the first interrupt acknowledge cycle is 4 (A31-A3 low, A2 high, BE3-BE1 high, and BE0 low). The address driven during the second interrupt acknowledge cycle is 0 (A31-A2 low, BE3-BE1 high, BE0 low).

Figure 5-22. Interrupt acknowledge cycles. Interrupt vector (0-255) is read on D0-D7 at end of second interrupt acknowledge bus cycle. Because each interrupt acknowledge bus cycle is followed by idle bus states, asserting NA has no practical effect. Choose the approach which is simplest for your system hardware design.
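As an illustration of how external logic might tell the two interrupt acknowledge cycles apart and pick up the vector, here is a minimal sketch. The function names are hypothetical; A2 is passed as a boolean that is `True` when the pin is high, and the data bus is passed as an integer.

```python
def inta_cycle_number(a2_high):
    """First acknowledge cycle drives A2 high, the second drives it low."""
    return 1 if a2_high else 2


def inta_byte_address(cycle):
    """Byte address driven during the given acknowledge cycle (1 or 2)."""
    if cycle not in (1, 2):
        raise ValueError("cycle must be 1 or 2")
    return 4 if cycle == 1 else 0


def interrupt_vector(data_bus):
    """Vector read at the end of the second cycle: D0-D7 only."""
    return data_bus & 0xFF


if __name__ == "__main__":
    print(inta_cycle_number(True), inta_byte_address(1))
    print(inta_cycle_number(False), inta_byte_address(2))
    print(hex(interrupt_vector(0xFFFFFF21)))
```

Masking with 0xFF reflects that only D0-D7 carry the vector. The upper data bits are ignored by this sketch.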
The LOCK output is asserted from the beginning of the first interrupt acknowledge cycle until the end of the second interrupt acknowledge cycle. Four idle bus states, Ti, are inserted by the Military Intel386 processor between the two interrupt acknowledge cycles, allowing at least 160 ns of locked idle time for future Military Intel386 processor speed selections up to 25 MHz (CLK2 up to 50 MHz), for compatibility with spec TRHRL of the 8259A interrupt controller.

During both interrupt acknowledge cycles, D0-D31 float. No data is read at the end of the first interrupt acknowledge cycle. At the end of the second interrupt acknowledge cycle, the Military Intel386 processor will read an external interrupt vector from D0-D7 of the data bus. The vector indicates the specific interrupt number (from 0-255) requiring service.

4.4.5 Halt Indication Cycle

The Military Intel386 processor halts as a result of executing a HALT instruction. Signaling its entrance into the halt state, a halt indication cycle is performed. The halt indication cycle is identified by the state of the bus definition signals shown in 4.2.5 Bus Cycle Definition and a byte address of 2. BE0 and BE2 are the only signals distinguishing halt indication from shutdown indication, which drives an address of 0. During the halt cycle undefined data is driven on D0-D31. The halt indication cycle must be acknowledged by READY asserted.

A halted Military Intel386 processor resumes execution when INTR (if interrupts are enabled) or NMI or RESET is asserted.

Figure 5-23. Halt indication cycle
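The 160 ns of locked idle time quoted for the interrupt acknowledge cycles follows from the four Ti states, if each bus state is taken to last two CLK2 periods. A small sketch of the arithmetic, with a hypothetical function name:

```python
def locked_idle_time_ns(clk2_mhz, idle_states=4):
    """Locked idle time between the two interrupt acknowledge cycles.

    Assumes each bus state lasts two CLK2 periods (one processor clock).
    """
    bus_state_ns = 2 * 1000.0 / clk2_mhz
    return idle_states * bus_state_ns


if __name__ == "__main__":
    for clk2 in (32, 40, 50):
        print("CLK2", clk2, "MHz ->", locked_idle_time_ns(clk2), "ns")
```

Slower clock selections give longer idle times, so the 50 MHz CLK2 case is the one that sets the 160 ns minimum.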
4.4.6 Shutdown Indication Cycle

The Military Intel386 processor shuts down as a result of a protection fault while attempting to process a double fault. Signaling its entrance into the shutdown state, a shutdown indication cycle is performed. The shutdown indication cycle is identified by the state of the bus definition signals shown in 4.2.5 Bus Cycle Definition and a byte address of 0. BE0 and BE2 are the only signals distinguishing shutdown indication from halt indication, which drives an address of 2. During the shutdown cycle undefined data is driven on D0-D31. The shutdown indication cycle must be acknowledged by READY asserted.

A shutdown Military Intel386 processor resumes execution when NMI or RESET is asserted.

Figure 5-24. Shutdown indication cycle
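Because halt and shutdown indication share the same bus definition signals, external logic only needs BE0 and BE2 to tell them apart. The sketch below assumes the bus definition signals have already been decoded as a halt/shutdown cycle. The function name is hypothetical, and `True` means the byte enable is asserted.

```python
def classify_indication(be0_asserted, be2_asserted):
    """Distinguish halt (byte address 2) from shutdown (byte address 0).

    Byte address 2 corresponds to BE2 asserted, byte address 0 to BE0
    asserted. Any other combination is rejected by this sketch.
    """
    if be2_asserted and not be0_asserted:
        return "halt"
    if be0_asserted and not be2_asserted:
        return "shutdown"
    raise ValueError("not a halt or shutdown indication pattern")


if __name__ == "__main__":
    print(classify_indication(be0_asserted=False, be2_asserted=True))
    print(classify_indication(be0_asserted=True, be2_asserted=False))
```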
4.5 Other Functional Descriptions

4.5.1 Entering and Exiting Hold Acknowledge

The bus hold acknowledge state, Th, is entered in response to the HOLD input being asserted. In the bus hold acknowledge state, the Military Intel386 processor floats all output or bidirectional signals, except for HLDA. HLDA is asserted as long as the Military Intel386 processor remains in the bus hold acknowledge state. In the bus hold acknowledge state, all inputs except HOLD, RESET, BUSY, ERROR, and PEREQ are ignored (also up to one rising edge on NMI is remembered for processing when HOLD is no longer asserted).

Figure 5-25. Requesting hold from idle bus. Note: For maximum design flexibility the Military Intel386 processor has no internal pullup resistors on its outputs. Your design may require an external pullup on ADS and other Military Intel386 processor outputs to keep them negated during float periods.

Th may be entered from a bus idle state as in Figure 5-25 or after the acknowledgement of the current physical bus cycle if the LOCK signal is not asserted, as in Figures 5-26 and 5-27. If asserting BS16 requires a second 16-bit bus cycle to complete a physical operand transfer, it is performed before HOLD is acknowledged, although the bus state diagrams in Figures 5-13 and 5-20 do not indicate that detail.

Th is exited in response to the HOLD input being negated. The following state will be Ti as in Figure 5-25 if no bus request is pending. The following bus state will be T1 if a bus request is internally pending, as in Figures 5-26 and 5-27.

Th is also exited in response to RESET being asserted.

If a rising edge occurs on the edge-triggered NMI input while in Th, the event is remembered as a non-maskable interrupt 2 and is serviced when Th is exited, unless of course, the Military Intel386 processor is reset before Th is exited.
4.5.2 Reset During Hold Acknowledge

RESET being asserted takes priority over HOLD being asserted. Therefore, Th is exited in response to the RESET input being asserted. If RESET is asserted while HOLD remains asserted, the Military Intel386 processor drives its pins to defined states during reset, as in Table 5-3 Pin State During Reset, and performs internal reset activity as usual.

If HOLD remains asserted when RESET is negated, the M80386 enters the hold acknowledge state before performing its first bus cycle, provided HOLD is still asserted when the Military Intel386 processor would otherwise perform its first bus cycle. If HOLD remains asserted when RESET is negated, the BUSY input is still sampled as usual to determine whether a self-test is being requested, and ERROR is still sampled as usual to determine whether an M387 NPX vs. an M80287 (or none) is present.

4.5.3 Bus Activity During and Following Reset

RESET is the highest priority input signal, capable of interrupting any processor activity when it is asserted. A bus cycle in progress can be aborted at any stage, or idle states or bus hold acknowledge states discontinued, so that the reset state is established.

RESET should remain asserted for at least 15 CLK2 periods to ensure it is recognized throughout the Military Intel386 processor, and at least 80 CLK2 periods if Military Intel386 processor self-test is going to be requested at the falling edge. RESET asserted pulses less than 15 CLK2 periods may not be recognized. RESET pulses less than 80 CLK2 periods followed by a self-test may cause the self-test to report a failure when no true failure exists. The additional RESET pulse width is required to clear additional state prior to a valid self-test.
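The RESET pulse-width requirement is stated in CLK2 periods, so the minimum time depends on the clock selected. A minimal sketch of the conversion, with hypothetical names; it returns the bare minimum and adds no design margin.

```python
MIN_RESET_PERIODS = 15            # RESET recognized throughout the processor
MIN_RESET_PERIODS_SELF_TEST = 80  # when self-test will be requested


def min_reset_width_ns(clk2_mhz, self_test=False):
    """Minimum RESET asserted time in nanoseconds for a CLK2 frequency."""
    periods = MIN_RESET_PERIODS_SELF_TEST if self_test else MIN_RESET_PERIODS
    return periods * 1000.0 / clk2_mhz


if __name__ == "__main__":
    print(min_reset_width_ns(50))                  # plain reset
    print(min_reset_width_ns(50, self_test=True))  # reset followed by self-test
```

With CLK2 at 50 MHz the two minimums come to 300 ns and 1600 ns.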
Figure 5-26. Requesting hold from active bus (NA negated). Note: HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup and hold (t23 and t24) requirements are met. This waveform is useful for determining hold acknowledge latency.

Provided the RESET falling edge meets setup and hold times t25 and t26, the internal processor clock phase is defined at that time, as illustrated by Figure 5-28 and Figure 7-7.

A Military Intel386 processor self-test may be requested at the time RESET is negated by having the BUSY input at a low level, as shown in Figure 5-28. The self-test requires 2^20 plus approximately 60 CLK2 periods to complete. The self-test duration is not affected by the test results. Even if the self-test indicates a problem, the Military Intel386 processor attempts to proceed with the reset sequence afterwards.

After the RESET falling edge (and after the self-test if it was requested) the Military Intel386 processor performs an internal initialization sequence for approximately 350 to 450 CLK2 periods. Also during the initialization, between the 20th CLK2 period and the first bus cycle, the ERROR input is sampled to determine the presence of a Military i387 coprocessor versus the presence of an M80287 (or no coprocessor). To distinguish between an M80287 being present and no coprocessor being present requires a software test.
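To get a feel for the durations involved, the period counts above can be converted to time for a given CLK2 frequency. The sketch reads the self-test figure as 2^20 plus approximately 60 CLK2 periods, and the function names are hypothetical. Both counts are approximate in the text, so the results are estimates.

```python
SELF_TEST_PERIODS = 2 ** 20 + 60   # approximate, per the text
INIT_PERIODS = (350, 450)          # approximate range, per the text


def self_test_ms(clk2_mhz):
    """Approximate self-test duration in milliseconds."""
    return SELF_TEST_PERIODS / (clk2_mhz * 1000.0)


def init_sequence_us(clk2_mhz):
    """Approximate (min, max) internal initialization time in microseconds."""
    low, high = INIT_PERIODS
    return low / clk2_mhz, high / clk2_mhz


if __name__ == "__main__":
    print(round(self_test_ms(50), 3), "ms self-test at CLK2 = 50 MHz")
    print(init_sequence_us(50), "us initialization at CLK2 = 50 MHz")
```

At CLK2 = 50 MHz the self-test takes roughly 21 ms, which dominates the 7 to 9 microseconds of internal initialization.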
Figure 5-27. Requesting hold from active bus (NA asserted). Note: HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup and hold (t23 and t24) requirements are met. This waveform is useful for determining hold acknowledge latency.

4.6 Self-Test Signature

Upon completion of self-test (if self-test was requested by holding BUSY low at least eight CLK2 periods before and after the falling edge of RESET), the EAX register will contain a signature of 00000000H indicating the Military Intel386 processor passed its self-test of microcode and major PLA contents with no problems detected. The passing signature in EAX, 00000000H, applies to all Military Intel386 processor revision levels. Any non-zero signature indicates the Military Intel386 processor unit is faulty.

4.7 Component and Revision Identifiers

To assist Military Intel386 processor users, the Military Intel386 processor after reset holds a component identifier and a revision identifier in its DX register. The upper 8 bits of DX hold 03H as identification of the Military Intel386 component. The lower 8 bits of DX hold an 8-bit unsigned binary number related to the component revision level. The revision identifier begins chronologically with a value zero and is subject to change (typically it will be incremented) with component steppings intended to have certain improvements or distinctions from previous steppings.

These features are intended to assist Military Intel386 microprocessor users to a practical extent. However, the revision identifier value is not guaranteed to change with every stepping revision, or to follow a completely uniform numerical sequence, depending on the type or intention of revision, or manufacturing materials required to be changed. Intel has sole discretion over these characteristics of the component.
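A minimal sketch of how the values left in EAX and DX after reset might be interpreted. The function name and result keys are hypothetical, and the EAX check is only meaningful when self-test was actually requested. The revision value in the example is made up for illustration.

```python
INTEL386_COMPONENT_ID = 0x03


def decode_reset_registers(eax, dx):
    """Interpret the self-test signature (EAX) and identifiers (DX)."""
    component_id = (dx >> 8) & 0xFF   # upper 8 bits of DX
    revision_id = dx & 0xFF           # lower 8 bits of DX
    return {
        "self_test_passed": eax == 0x00000000,
        "component_id": component_id,
        "is_intel386": component_id == INTEL386_COMPONENT_ID,
        "revision_id": revision_id,
    }


if __name__ == "__main__":
    # 0x0308 is a hypothetical DX value: component 03H, revision 08H.
    print(decode_reset_registers(0x00000000, 0x0308))
```

Since the revision identifier is not guaranteed to follow a uniform sequence, software should treat it as informational and not as an ordering between steppings.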
Figure 5-28. Bus activity from reset until first code fetch. Notes: 1. BUSY should be held stable for 8 CLK2 periods before and after the CLK2 period in which RESET falling edge occurs. 2. If self-test is requested the Military Intel386 processor outputs remain in their reset state, as shown here and in Table 5-3.
4.8 Coprocessor Interfacing

The Military Intel386 processor provides an automatic interface for the Intel i387 numeric floating-point coprocessor. The i387 coprocessor uses an I/O-mapped interface driven automatically by the Military Intel386 processor and assisted by three dedicated signals: BUSY, ERROR, and PEREQ.

As the Military Intel386 processor begins supporting a coprocessor instruction, it tests the BUSY and ERROR signals to determine if the coprocessor can accept its next instruction. Thus, the BUSY and ERROR inputs eliminate the need for any "preamble" bus cycles for communication between processor and coprocessor. The Military i387 NPX can be given its command opcode immediately. The dedicated signals provide instruction synchronization, and eliminate the need of using the Military Intel386 processor WAIT opcode (9BH) for Military i387 NPX instruction synchronization (the WAIT opcode was required when M8086 or M8088 was used with the M8087 coprocessor).

Custom coprocessors can be included in systems based on the Military Intel386 processor, via memory-mapped or I/O-mapped interfaces. Such coprocessor interfaces allow a completely custom protocol, and are not limited to a set of coprocessor protocol "primitives". Instead, memory-mapped or I/O-mapped interfaces may use all applicable Military Intel386 processor instructions for high-speed coprocessor communication. The BUSY and ERROR inputs of the Military Intel386 processor may also be used for the custom coprocessor interface, if such hardware assist is desired. These signals can be tested by the Military Intel386 processor WAIT opcode (9BH). The WAIT instruction will wait until the BUSY input is negated (interruptable by an NMI or enabled INTR input), but generates an exception 16 fault if the ERROR pin is in the asserted state when BUSY goes (or is) negated.
If the custom coprocessor interface is memory-mapped, protection of the addresses used for the interface can be provided with the Military Intel386 processor on-chip paging or segmentation mechanisms. If the custom interface is I/O-mapped, protection of the interface can be provided with the Military Intel386 processor IOPL (I/O privilege level) mechanism.

The Military i387 numeric coprocessor interface is I/O mapped as shown in Table 5-10. Note that the Military i387 coprocessor interface addresses are beyond the 0H-FFFFH range for programmed I/O. When the Military Intel386 processor supports the Military i387 coprocessor, the Military Intel386 processor automatically generates bus cycles to the coprocessor interface addresses.

Table 5-10. Numeric Coprocessor Port Addresses

Address in Military Intel386 processor I/O space     M387 coprocessor register
800000F8H                                            Opcode register (32-bit port)
800000FCH                                            Operand register (32-bit port)

To correctly map the Military i387 NPX registers to the appropriate I/O addresses, connect the Military i387 NPX CMD0 pin directly to the A2 output of the Military Intel386 processor.

4.8.1 Software Testing for Coprocessor Presence

When software is used to test for coprocessor (M387 NPX) presence, it should use only the following coprocessor opcodes: FINIT, FNINIT, FSTCW mem, FSTSW mem, FSTSW AX. To use other coprocessor opcodes when a coprocessor is known to be not present, first set EM = 1 in Military Intel386 processor CR0.
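The reason CMD0 can be wired straight to A2 is visible in the two port addresses of Table 5-10: they differ only in address bit 2. A short sketch that checks this, with hypothetical function names:

```python
OPCODE_PORT = 0x800000F8    # opcode register (32-bit port)
OPERAND_PORT = 0x800000FC   # operand register (32-bit port)


def a2_level(address):
    """Value of address bit A2 (0 or 1) for a given I/O address."""
    return (address >> 2) & 1


def beyond_programmed_io(address):
    """True when the address lies above the 0H-FFFFH programmed I/O range."""
    return address > 0xFFFF


if __name__ == "__main__":
    print(a2_level(OPCODE_PORT), a2_level(OPERAND_PORT))
    print(hex(OPCODE_PORT ^ OPERAND_PORT))
```

A2 is 0 for the opcode register and 1 for the operand register, and both addresses fall outside the range an I/O instruction can generate, which is why only the processor's automatic coprocessor cycles reach them.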
5.0 Mechanical Data

5.1 Introduction

In this section, the physical packaging and its connections are described in detail.

5.2 Pin Assignment

The Military Intel386 processor pinout as viewed from the top side of the PGA component is shown by Figure 6-1. Its pinout as viewed from the pin side of the component is Figure 6-2. The Military Intel386 processor pinout for the CQFP is shown in Figure 6-3.

VCC and GND connections must be made to multiple VCC and VSS (GND) pins. Each VCC and VSS must be connected to the appropriate voltage level. The circuit board should include VCC and GND planes for power distribution and all VCC and VSS pins must be connected to the appropriate plane.

Note: Pins identified as "N.C." should remain completely unconnected.

Figure 6-1. Military Intel386 processor PGA pinout - view from top side. Note: NC pins should always remain unconnected.
Figure 6-2. Military Intel386 processor PGA pinout - view from pin side. Note: NC pins should always remain unconnected.
Table 6-1. Military Intel386 Processor PGA Pinout - Functional Grouping

Each entry is given as pin, then signal.

Address and byte enables:
N2 A31    P1 A30    M2 A29    L3 A28    N1 A27    M1 A26    K3 A25    L2 A24
L1 A23    K2 A22    K1 A21    J1 A20    H3 A19    H2 A18    H1 A17    G1 A16
F1 A15    E1 A14    E2 A13    E3 A12    D1 A11    D2 A10    D3 A9     C1 A8
C2 A7     C3 A6     B2 A5     B3 A4     A3 A3     C4 A2
A13 BE3   B13 BE2   C13 BE1   E12 BE0

Data:
M5 D31    P3 D30    P4 D29    M6 D28    N5 D27    P5 D26    N6 D25    P7 D24
N8 D23    P9 D22    N9 D21    M9 D20    P10 D19   P11 D18   N10 D17   N11 D16
M11 D15   P12 D14   P13 D13   N12 D12   N13 D11   M12 D10   N14 D9    L13 D8
K12 D7    L14 D6    K13 D5    K14 D4    J14 D3    H14 D2    H13 D1    H12 D0

Control and clock:
C9 RESET  D14 HOLD  M14 HLDA  F12 CLK2  E14 ADS   B10 W/R   A11 D/C   A12 M/IO
C10 LOCK  D13 NA    G13 READY C14 BS16  B7 INTR   C8 PEREQ  B9 BUSY   A8 ERROR
B8 NMI

VCC:
A1    A5    A7    A10   A14   C5    C12   D12   G2    G3
G12   G14   L12   M3    M7    M13   N4    N7    P2    P8

VSS:
A2    A6    A9    B1    B5    B11   B14   C11   F2    F3    F14
J2    J3    J12   J13   M4    M8    M10   N3    P6    P14

N.C.:
A4    B4    B6    B12   C6    C7    E13   F13
Figure 6-3. Military Intel386 processor CQFP pinout - view from top side. (Staggered pin arrangement is shown for clarity only. Actual package has pins of equal length.) Note: NC pins should always remain unconnected.
Table 6-2. Military Intel386 Microprocessor CQFP Pin Cross-Reference

Each entry is given as pin, then signal.

1 A16      2 A17      3 A15      4 A14      5 A13      6 A12      7 A11      8 NC
9 A9       10 A10     11 VSS     12 A7      13 A8      14 VCC     15 VSS     16 A5
17 A6      18 NC      19 NC      20 A3      21 A4      22 NC      23 NC      24 A2
25 NC      26 VCC     27 VSS     28 NC      29 NC      30 NC      31 VCC     32 VSS
33 NC      34 VCC     35 VSS     36 NC      37 INTR    38 NC      39 NC      40 NC
41 NMI     42 PEREQ   43 VSS     44 VCC     45 ERROR   46 RESET   47 NC      48 NC
49 BUSY    50 VCC     51 VSS     52 VCC     53 VSS     54 VCC     55 VSS     56 VCC
57 VSS     58 VCC     59 LOCK    60 W/R     61 VSS     62 VCC     63 M/IO    64 D/C
65 NC      66 NC      67 BE2     68 NC      69 VCC     70 VSS     71 BE0     72 BE3
73 NC      74 BE1     75 NA      76 NC      77 NC      78 BS16    79 HOLD    80 CLK2
81 ADS     82 NC      83 READY   84 NC      85 D1      86 NC      87 D2      88 NC
89 D4      90 D0      91 VCC     92 VSS     93 D6      94 D3      95 VSS     96 VCC
97 D8      98 D5      99 VCC     100 VSS    101 HLDA   102 D7     103 NC     104 D10
105 D9     106 NC     107 NC     108 D12    109 D11    110 VCC    111 VSS    112 D14
113 D13    114 NC     115 D15    116 D16    117 VCC    118 VSS    119 D18    120 D17
121 D20    122 VSS    123 D22    124 D19    125 NC     126 NC     127 D24    128 D21
129 NC     130 NC     131 D25    132 D23    133 NC     134 NC     135 D27    136 D26
137 VCC    138 VSS    139 D29    140 D28    141 VCC    142 VSS    143 D31    144 D30
145 NC     146 A31    147 A30    148 A29    149 A28    150 VCC    151 VSS    152 A27
153 A26    154 A25    155 NC     156 A23    157 A24    158 A21    159 A22    160 NC
161 A20    162 A19    163 A18    164 NC
Table 6-3. Military Intel386 Processor PGA Package Thermal Characteristics

Thermal resistance (°C/Watt) versus airflow in ft./min (m/sec)

Airflow, ft./min                             0      50     100    200    400    600    800
Airflow, m/sec                               (0)    (0.25) (0.50) (1.01) (2.03) (3.04) (4.06)
θ junction-to-case
(case measured as Fig. 6-4)                  2      2      2      2      2      2      2
θ case-to-ambient (no heatsink)              19     18     17     15     12     10     9

Notes:
1. Table 6-3 applies to Military Intel386 PGA plugged into socket or soldered directly into board.
2. θJA = θJC + θCA.
3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)
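As a rough illustration of how the thermal resistances in Table 6-3 might be used, the sketch below combines them per note 2 and estimates the highest ambient temperature that keeps the case at or below a given limit. The relation T_A = T_C - P x θCA is the usual definition of thermal resistance and is not stated in the table itself. The 3.0 W power figure and the function names are hypothetical.

```python
THETA_JC = 2  # junction-to-case, degrees C per watt
# Case-to-ambient (no heatsink), keyed by airflow in ft./min.
THETA_CA = {0: 19, 50: 18, 100: 17, 200: 15, 400: 12, 600: 10, 800: 9}


def theta_ja(airflow_ft_min):
    """Junction-to-ambient resistance: theta_JA = theta_JC + theta_CA."""
    return THETA_JC + THETA_CA[airflow_ft_min]


def max_ambient_c(case_limit_c, power_w, airflow_ft_min):
    """Highest ambient temperature that keeps the case at or below its limit."""
    return case_limit_c - power_w * THETA_CA[airflow_ft_min]


if __name__ == "__main__":
    print(theta_ja(0), "C/W junction-to-ambient in still air")
    # Hypothetical 3.0 W dissipation, 125 C case limit, 400 ft./min airflow.
    print(max_ambient_c(125, 3.0, 400), "C maximum ambient")
```

Only the airflow values listed in the table are accepted. Interpolating between them would be a further assumption.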
6.0 Electrical Data

6.1 Introduction

The following sections describe recommended electrical connections for the Military Intel386 processor, and its electrical specifications.

6.2 Power and Grounding

6.2.1 Power Connections

The Military Intel386 processor is implemented in CHMOS III technology and has modest power requirements. However, its high clock frequency and 72 output buffers (address, data, control, and HLDA) can cause power surges as multiple output buffers drive new signal levels simultaneously. For clean on-chip power distribution at high frequency, 20 VCC and 21 VSS pins separately feed functional units of the Military Intel386 processor.

Power and ground connections must be made to all external VCC and GND pins of the Military Intel386 processor. On the circuit board, all VCC pins must be connected on a VCC plane. All VSS pins must be likewise connected on a GND plane.

6.2.2 Power Decoupling Recommendations

Liberal decoupling capacitance should be placed near the Military Intel386 processor. The Military Intel386 processor driving its 32-bit parallel address and data buses at high frequencies can cause transient power surges, particularly when driving large capacitive loads.

Low inductance capacitors and interconnects are recommended for best high frequency electrical performance. Inductance can be reduced by shortening circuit board traces between the Military Intel386 processor and decoupling capacitors as much as possible. Capacitors specifically for PGA packages are also commercially available, for the lowest possible inductance.

6.2.3 Resistor Recommendations

The ERROR and BUSY inputs have resistor pullups of approximately 20 kΩ built-in to the Military Intel386 processor to keep these signals negated when neither M80287 nor Military i387 NPX is present in the system (or temporarily removed from its socket).
The BS16 input also has an internal pullup resistor of approximately 20 kΩ, and the PEREQ input has an internal pulldown resistor of approximately 20 kΩ.

In typical designs, the external pullup resistors shown in Table 7-1 are recommended. However, a particular design may have reason to adjust the resistor values recommended here, or alter the use of pullup resistors in other ways.

6.2.4 Other Connection Recommendations

For reliable operation, always connect unused inputs to an appropriate signal level. N.C. pins should always remain unconnected.

Particularly when not using interrupts or bus hold (as when first prototyping, perhaps), prevent any chance of spurious activity by connecting these associated inputs to GND:

Pin     Signal
B7      INTR
B8      NMI
D14     HOLD

If not using address pipelining, pull up D13 NA to VCC.

If not using 16-bit bus size, pull up C14 BS16 to VCC.

Pullups in the range of 20 kΩ are recommended.

Table 7-1. Recommended Resistor Pullups to VCC

Pin and signal    Pullup value     Purpose
E14 ADS           20 kΩ ±10%       Lightly pull ADS negated during Military Intel386 processor hold acknowledge states
C10 LOCK          20 kΩ ±10%       Lightly pull LOCK negated during Military Intel386 processor hold acknowledge states
6.3 maximum ratings

table 7-2. maximum ratings

parameter | military intel386 processor maximum rating
storage temperature | -65°c to +150°c
case temperature under bias | -55°c to +125°c
supply voltage with respect to v ss | -0.5v to +6.5v
voltage on other pins | -0.5v to v cc + 0.5v

table 7-2 is a stress rating only, and functional operation at the maximums is not guaranteed. functional operating conditions are given in 6.5 dc specifications and 6.6 ac specifications.

extended exposure to the maximum ratings may affect device reliability. furthermore, although the military intel386 processor contains protective circuitry to resist damage from static electric discharge, always take precautions to avoid high static voltages or electric fields.

6.4 operating conditions

mil-std-883

symbol | description | min | max | units
t c | case temperature (instant on) | -55 | +125 | °c
v cc | digital supply voltage | 4.75 | 5.25 | v

extended temperature

symbol | description | min | max | units
t c | case temperature (instant on) | -40 | +110 | °c
v cc | digital supply voltage | 4.75 | 5.25 | v

military temperature only (mto)

symbol | description | min | max | units
t c | case temperature (instant on) | -55 | +125 | °c
v cc | digital supply voltage | 4.75 | 5.25 | v
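the three operating-condition tables above can be captured in a small lookup for design-rule checks. the following is a minimal sketch only: the limits are copied from section 6.4, while the dictionary layout and the function name are illustrative assumptions, not part of any intel tool.

```python
# Case temperature limits in degrees C, copied from the section 6.4 tables.
# The dictionary layout and all names here are illustrative assumptions.
TC_LIMITS = {
    "mil-std-883": (-55, 125),
    "extended temperature": (-40, 110),
    "military temperature only": (-55, 125),
}

# Digital supply voltage limits in volts (identical for all three grades).
VCC_MIN, VCC_MAX = 4.75, 5.25


def within_operating_conditions(grade, tc_celsius, vcc_volts):
    """Return True if the case temperature and supply voltage both fall
    inside the section 6.4 limits for the given product grade."""
    tc_min, tc_max = TC_LIMITS[grade]
    return tc_min <= tc_celsius <= tc_max and VCC_MIN <= vcc_volts <= VCC_MAX
```

a design point of 110°c and 5.0v passes for every grade, while 125°c fails for the extended temperature grade only.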
6.5 dc specifications (over specified operating conditions)

table 7-3. military intel386™ processor dc characteristics

symbol | parameter | min | max | unit | notes
v il | input low voltage | -0.3 | 0.8 | v |
v ih | input high voltage | 2.0 | v cc + 0.3 | v |
v ilc | clk2 input low voltage | -0.3 | 0.8 | v |
v ihc | clk2 input high voltage | v cc - 0.8 | v cc + 0.3 | v |
v ol | output low voltage | | | |
 | i ol = 4 ma: a2-a31, d0-d31 | | 0.45 | v |
 | i ol = 5 ma: be0#-be3#, w/r#, d/c#, m/io#, lock#, ads#, hlda | | 0.45 | v |
v oh | output high voltage | | | |
 | i oh = -1 ma: a2-a31, d0-d31 | 2.4 | | v |
 | i oh = -0.9 ma: be0#-be3#, w/r#, d/c#, m/io#, lock#, ads#, hlda | 2.4 | | v |
i li | input leakage current (for all pins except bs16#, pereq, busy#, and error#) | | ±15 | µa | 0v ≤ v in ≤ v cc
i ih | input leakage current (pereq pin) | | 200 | µa | v ih = 2.4v (note 1)
i il | input leakage current (bs16#, busy#, and error# pins) | | -400 | µa | v il = 0.45v (note 2)
i lo | output leakage current | | ±15 | µa | 0.45v ≤ v out ≤ v cc
i cc | supply current | | | |
 | clk2 = 32 mhz: with 16 mhz military intel386 processor | | 460 | ma | i cc typ. = 370 ma
 | clk2 = 40 mhz: with 20 mhz military intel386 processor | | 550 | ma | i cc typ. = 460 ma
 | clk2 = 50 mhz: with 25 mhz military intel386 processor | | 680 | ma | i cc typ. = 580 ma
c in | input capacitance | | 20 | pf | f c = 1 mhz
c out | output or i/o capacitance | | 25 | pf | f c = 1 mhz
c clk | clk2 capacitance | | 20 | pf | f c = 1 mhz

notes:
1. pereq input has an internal pulldown resistor.
2. bs16#, busy# and error# inputs each have an internal pullup resistor.
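as a worked example, an upper bound on supply power can be estimated by multiplying the maximum i cc from table 7-3 by the maximum v cc from section 6.4. this product is not a figure the datasheet itself specifies; it is a rough sizing estimate, and the names below are illustrative.

```python
# Maximum supply current in mA per processor speed (MHz), from table 7-3.
ICC_MAX_MA = {16: 460, 20: 550, 25: 680}

# Maximum digital supply voltage in volts, from section 6.4.
VCC_MAX_V = 5.25


def worst_case_supply_power_watts(processor_mhz):
    """Estimate an upper bound on supply power as Icc(max) * Vcc(max).

    This is a sizing estimate under stated assumptions, not a value
    taken from the datasheet."""
    return ICC_MAX_MA[processor_mhz] / 1000.0 * VCC_MAX_V
```

for the 16 mhz part this gives 0.460 a × 5.25 v, or about 2.4 w.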
6.6 ac specifications

6.6.1 ac specification definitions

the ac specifications, given in tables 7-4 and 7-5, consist of output delays, input setup requirements and input hold requirements. all ac specifications are relative to the clk2 rising edge crossing the 2.0v level.

ac spec measurement for a 12 mhz military intel386 processor is defined by figure 7-1. inputs must be driven to the voltage levels indicated by figure 7-1 when ac specifications are measured. military intel386 processor output delays are specified with minimum and maximum limits, measured as shown. the minimum military intel386 processor delay times are hold times provided to external circuitry. military intel386 processor input setup and hold times are specified as minimums, defining the smallest acceptable sampling window. within the sampling window, a synchronous input signal must be stable for correct military intel386 processor operation.

outputs ads#, w/r#, d/c#, m/io#, lock#, be0#-be3#, a2-a31 and hlda only change at the beginning of phase one. d0-d31 (write cycles) only change at the beginning of phase two. the ready#, hold, busy#, error#, pereq and d0-d31 (read cycles) inputs are sampled at the beginning of phase one. the na#, bs16#, intr and nmi inputs are sampled at the beginning of phase two.

figure 7-1. drive levels and measurement points for 12 mhz military intel386™ processor ac specifications
6.6.2 ac specification tables (over specified operating conditions)

output trip level: 1.5v

table 7-4. military intel386™ processor ac characteristics

symbol | parameter | 16 mhz min | 16 mhz max | 20 mhz min | 20 mhz max | 25 mhz min | 25 mhz max | unit | figure ref. | notes
 | operating frequency | 4 | 16 | 4 | 20 | | | mhz | | half of clk2 frequency
t1 | clk2 period | 31 | 125 | 25 | 125 | 20 | 125 | ns | 7-3 |
t2a | clk2 high time | 9 | | 8 | | 7 | | ns | 7-3 | at 2v
t2b | clk2 high time | 5 | | 5 | | 4 | | ns | 7-3 | at (v cc - 0.8v)
t3a | clk2 low time | 9 | | 8 | | 7 | | ns | 7-3 | at 2v
t3b | clk2 low time | 7 | | 6 | | 5 | | ns | 7-3 | at 0.8v
t4 | clk2 fall time | | 8 | | 8 | | 7 | ns | 7-3 | (v cc - 0.8v) to 0.8v
t5 | clk2 rise time | | 8 | | 8 | | 7 | ns | 7-3 | 0.8v to (v cc - 0.8v)
t6 | a2-a31 valid delay | 4 | 36 | 4 | 27 | 4 | 20 | ns | 7-5 | c l = 120 pf *
t7 | a2-a31 float delay | 4 | 40 | 4 | 32 | 4 | 30 | ns | 7-6 | (note 1)
t8 | be0#-be3# valid delay | 4 | 36 | 4 | 27 | 4 | 24 | ns | 7-5 | c l = 75 pf *
t9 | be0#-be3#, lock# float delay | 4 | 40 | 4 | 32 | 4 | 30 | ns | 7-6 | (note 1)
t10 | w/r#, m/io#, d/c#, ads# valid delay | 6 | 33 | 6 | 28 | 4 | 19 | ns | 7-5 | c l = 75 pf *
t11 | w/r#, m/io#, d/c#, ads# float delay | 6 | 35 | 6 | 30 | 4 | 30 | ns | 7-6 | (note 1)
t12 | d0-d31 write data valid delay | 4 | 48 | 6 | 38 | 8 | 27 | ns | 7-5 | c l = 120 pf *
t13 | d0-d31 write data float delay | 4 | 35 | 4 | 27 | 4 | 22 | ns | 7-6 | (note 1)
t14 | hlda valid delay | 6 | 33 | 6 | 28 | 4 | 22 | ns | 7-6 | c l = 75 pf *
t15 | na# setup time | 11 | | 9 | | 7 | | ns | 7-4 |
t16 | na# hold time | 14 | | 14 | | 3 | | ns | 7-4 |
t17 | bs16# setup time | 13 | | 13 | | 7 | | ns | 7-4 |
t18 | bs16# hold time | 21 | | 21 | | 3 | | ns | 7-4 |
t19 | ready# setup time | 21 | | 12 | | 8 | | ns | 7-4 |
t20 | ready# hold time | 4 | | 4 | | 4 | | ns | 7-4 |
t21 | d0-d31 read setup time | 11 | | 11 | | 6 | | ns | 7-4 |

* c l = 50 pf for 25 mhz.
6.6.2 ac specification tables (over specified operating conditions) (continued)

output trip level: 1.5v

table 7-4. military intel386™ processor ac characteristics (continued)

symbol | parameter | 16 mhz min | 16 mhz max | 20 mhz min | 20 mhz max | 25 mhz min | 25 mhz max | unit | figure ref. | notes
t22 | d0-d31 read hold time | 6 | | 6 | | 5 | | ns | 7-4 |
t23 | hold setup time | 26 | | 17 | | 15 | | ns | 7-4 |
t24 | hold hold time | 5 | | 5 | | 3 | | ns | 7-4 |
t25 | reset setup time | 13 | | 12 | | 10 | | ns | 7-7 |
t26 | reset hold time | 4 | | 4 | | 3 | | ns | 7-7 |
t27 | nmi, intr setup time | 16 | | 16 | | 6 | | ns | 7-4 | (note 2)
t28 | nmi, intr hold time | 16 | | 16 | | 6 | | ns | 7-4 | (note 2)
t29 | pereq, error#, busy# setup time | 16 | | 14 | | 6 | | ns | 7-4 | (note 2)
t30 | pereq, error#, busy# hold time | 5 | | 5 | | 5 | | ns | 7-4 | (note 2)

notes:
1. float condition occurs when maximum output current becomes less than i lo in magnitude.
2. these inputs are allowed to be asynchronous to clk2. the setup and hold specifications are given for testing purposes, to assure recognition within a specific clk2 period.

6.6.3 ac test loads

figure 7-2. ac test load

c l = 120 pf * on a2-a31, d0-d31
c l = 75 pf * on be0#-be3#, w/r#, m/io#, d/c#, ads#, lock#, hlda
c l includes all parasitic capacitances.
* c l = 50 pf for 25 mhz.

6.6.4 ac timing waveforms

figure 7-3. clk2 timing
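table 7-4 states that the operating frequency is half of the clk2 frequency, and gives minimum and maximum clk2 periods (t1) for each speed grade. the sketch below, with illustrative names, converts a clk2 frequency into a period and checks it against the t1 limits. note that the nominal 32 mhz clk2 period is 31.25 ns, which the table lists as a 31 ns minimum.

```python
# t1 (CLK2 period) limits in ns per speed grade, copied from table 7-4.
# Keys are the processor speed grade in MHz; names are illustrative.
T1_LIMITS_NS = {16: (31, 125), 20: (25, 125), 25: (20, 125)}


def clk2_period_ns(clk2_mhz):
    """Period of the CLK2 input in nanoseconds."""
    return 1000.0 / clk2_mhz


def processor_mhz_from_clk2(clk2_mhz):
    """Operating frequency is half of the CLK2 frequency (table 7-4)."""
    return clk2_mhz / 2.0


def clk2_period_ok(speed_grade_mhz, clk2_mhz):
    """Check a CLK2 frequency against the t1 min/max for a speed grade."""
    t1_min, t1_max = T1_LIMITS_NS[speed_grade_mhz]
    return t1_min <= clk2_period_ns(clk2_mhz) <= t1_max
```

for example, a 16 mhz part clocked with a 32 mhz clk2 passes the check, while the same part clocked at 40 mhz (25 ns period) does not.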
figure 7-4. input setup and hold timing

figure 7-5. output valid delay timing
figure 7-6. output float delay and hlda valid delay timing

the second internal processor phase following reset high-to-low transition (provided t25 and t26 are met) is φ2.

figure 7-7. reset setup and hold timing, and internal phase
6.7 designing for ice™-386 use

the in-circuit emulator product for the military intel386 processor is the ice™-386. because of the high operating frequency of military intel386 processor systems and ice-386, there is no cable separating the ice-386 probe module from the target system. the ice-386 probe module has several electrical and mechanical characteristics that should be taken into consideration when designing the hardware.

capacitive loading: ice-386 adds up to 25 pf to each line.

drive requirement: ice-386 adds one standard ttl load on the clk2 line, up to one advanced low-power schottky ttl load per control signal line, and one advanced low-power schottky ttl load per address, byte enable, and data line. these loads are within the probe module and are driven by the probe's m80386, which has the standard drive and loading capability listed in tables 7-3 and 7-4.

power requirement: for noise immunity the ice-386 probe is powered by the user system. the high-speed probe circuitry draws up to 0.7a plus the maximum military intel386 processor i cc from the user military intel386 processor socket.

military intel386 processor location and orientation: the ice-386 processor module (pm), and the optional isolation board (oib) used initially for extra electrical buffering of the ice, require clearance as illustrated in figures 7-8 and 7-9, respectively. figures 7-8 and 7-9 also illustrate the via holes in these modules for recommended orientation of a screw-actuated zif socket. figure 7-10 illustrates the recommended orientation for a lever-actuated zif socket.

ready# drive: the ice-386 system may be able to clear a user system ready# hang if the user's ready# driver is implemented with an open-collector or tri-state device.
optional interface board (oib) and clk2 speed reduction: when the ice-386 processor probe is first attached to an unverified user system, the oib helps ice-386 function in user systems with bus faults (shorted signals, etc.). after electrical verification it may be removed. while the oib is installed, the user system must run at a reduced clk2 frequency of 16 mhz maximum.

cache coherence: ice-386 loads user memory by performing military intel386 processor write cycles. note that if the user system is not designed to update or invalidate its cache (if it has a cache) upon processor writes to memory, the cache could contain stale instruction code and/or data. for best use of ice-386, the user should consider designing the cache (if any) to update itself automatically when processor writes occur, or find another method of maintaining cache data coherence with main user memory.

figure 7-8. ice™-386 processor module clearance requirements (inches)
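the loading and power figures in section 6.7 can be folded into a quick budget check when planning for emulator use. the sketch below is illustrative only: the 0.7 a and 25 pf figures come from section 6.7 and the i cc maximums from table 7-3, while the function names and the idea of adding the probe capacitance to a board-level estimate are assumptions.

```python
# Figures from section 6.7 (ICE-386) and table 7-3 (Icc max).
ICE_PROBE_CURRENT_A = 0.7        # "up to 0.7A" drawn by the probe circuitry
ICE_ADDED_CAPACITANCE_PF = 25    # "up to 25 pF" added to each line
ICC_MAX_A = {16: 0.460, 20: 0.550, 25: 0.680}


def socket_current_with_ice_a(processor_mhz):
    """Worst-case current drawn from the processor socket with the
    ICE-386 probe installed: probe current plus maximum Icc."""
    return ICE_PROBE_CURRENT_A + ICC_MAX_A[processor_mhz]


def line_load_with_ice_pf(board_load_pf):
    """Worst-case capacitive load on a signal line once the probe's
    added capacitance is included. board_load_pf is the designer's
    own estimate for the target board (hypothetical input)."""
    return board_load_pf + ICE_ADDED_CAPACITANCE_PF
```

for a 25 mhz system, the socket supply should therefore be able to source about 1.38 a.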
figure 7-9. ice™-386 optional interface module clearance requirements (inches)

figure 7-10. recommended orientation of lever-actuated zif socket for ice™-386 use
7.0 instruction set

this section describes the military intel386 processor instruction set. a table lists all instructions along with instruction encoding diagrams and clock counts. further details of the instruction encoding are then provided in the following sections, which completely describe the encoding structure and the definition of all fields occurring within military intel386 processor instructions.

7.1 military intel386™ processor instruction encoding and clock count summary

to calculate elapsed time for an instruction, multiply the instruction clock count, as listed in table 8-1 below, by the processor clock period (e.g. 62.5 ns for a military intel386 processor operating at 16 mhz (32 mhz clk2 signal)). the actual clock count of a military intel386 processor program will average 5% more than the calculated clock count due to instruction sequences which execute faster than they can be fetched from memory.

for more detailed information on the encodings of instructions refer to section 7.2 instruction encodings. section 7.2 explains the general structure of instruction encodings, and defines exactly the encodings of all fields contained within the instruction.

instruction clock count assumptions

1. the instruction has been prefetched, decoded, and is ready for execution.
2. bus cycles do not require wait states.
3. there are no local bus hold requests delaying processor access to the bus.
4. no exceptions are detected during instruction execution.
5. if an effective address is calculated, it does not use two general register components. one register, scaling and displacement can be used within the clock counts shown. however, if the effective address calculation uses two general register components, add 1 clock to the clock count shown.

instruction clock count notation

1. if two clock counts are given, the smaller refers to a register operand and the larger refers to a memory operand.
2. n = number of times repeated.
3. m = number of components in the next instruction executed, where the entire displacement (if any) counts as one component, the entire immediate data (if any) counts as one component, and all other bytes of the instruction and prefix(es) each count as one component.

wait states

add 1 clock per wait state to instruction execution for each data access.
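the elapsed-time rule and the adjustments described above (one extra clock for a two-register effective address, and one extra clock per wait state for each data access) can be combined in one small helper. this is a minimal sketch; the function and parameter names are illustrative assumptions.

```python
def instruction_time_ns(clock_count, processor_mhz,
                        wait_states=0, data_accesses=0,
                        two_register_ea=False):
    """Elapsed time for one instruction, in nanoseconds.

    clock_count     -- value read from the clock count table
    processor_mhz   -- processor frequency (half of the CLK2 frequency)
    wait_states     -- wait states per bus cycle
    data_accesses   -- number of data accesses the instruction makes
    two_register_ea -- True if the effective address uses two general
                       register components (adds 1 clock)
    """
    clocks = clock_count
    clocks += wait_states * data_accesses   # 1 clock per wait state per access
    if two_register_ea:
        clocks += 1
    clock_period_ns = 1000.0 / processor_mhz
    return clocks * clock_period_ns
```

for example, a 2-clock instruction at 16 mhz takes 2 × 62.5 ns = 125 ns. a 4-clock memory read with one wait state takes 5 × 62.5 ns = 312.5 ns. for a whole program, the text suggests scaling the summed clock counts by roughly 1.05 to account for fetch delays.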
table 8-1. military intel386™ processor instruction set clock count summary

instruction | format | clock count, real address mode or virtual 8086 mode | clock count, protected virtual address mode | notes, real address mode or virtual 8086 mode | notes, protected virtual address mode

general data transfer
mov = move:
register to register/memory | 1000100w mod reg r/m | 2/2 | 2/2 | b | h
register/memory to register | 1000101w mod reg r/m | 2/4 | 2/4 | b | h
immediate to register/memory | 1100011w mod 000 r/m immediate data | 2/2 | 2/2 | b | h
immediate to register (short form) | 1011 w reg immediate data | 2 | 2 | |
memory to accumulator (short form) | 1010000w full displacement | 4 | 4 | b | h
accumulator to memory (short form) | 1010001w full displacement | 2 | 2 | b | h
register memory to segment register | 10001110 mod sreg3 r/m | 2/5 | 18/19 | b | h, i, j
segment register to register/memory | 10001100 mod sreg3 r/m | 2/2 | 2/2 | b | h
movsx = move with sign extension
register from register/memory | 00001111 1011111w mod reg r/m | 3/6 | 3/6 | b | h
movzx = move with zero extension
register from register/memory | 00001111 1011011w mod reg r/m | 3/6 | 3/6 | b | h
push = push:
register/memory | 11111111 mod 110 r/m | 5 | 5 | b | h
register (short form) | 01010 reg | 2 | 2 | b | h
segment register (es, cs, ss or ds) | 000 sreg2 110 | 2 | 2 | b | h
segment register (fs or gs) | 00001111 10 sreg3 000 | 2 | 2 | b | h
immediate | 011010s0 immediate data | 2 | 2 | b | h
pusha = push all | 01100000 | 18 | 18 | b | h
pop = pop
register/memory | 10001111 mod 000 r/m | 5 | 5 | b | h
register (short form) | 01011 reg | 4 | 4 | b | h
segment register (es, ss or ds) | 000 sreg2 111 | 7 | 21 | b | h, i, j
segment register (fs or gs) | 00001111 10 sreg3 001 | 7 | 21 | b | h, i, j
popa = pop all | 01100001 | 24 | 24 | b | h
xchg = exchange
register/memory with register | 1000011w mod reg r/m | 3/5 | 3/5 | b, f | f, h
register with accumulator (short form) | 10010 reg | 3 | 3 | |
in = input from:
fixed port | 1110010w port number | 12 (virtual 8086 mode: †26) | 6*/26** | | m
variable port | 1110110w | 13 (virtual 8086 mode: †27) | 7*/27** | | m
out = output to:
fixed port | 1110011w port number | 10 (virtual 8086 mode: †24) | 4*/24** | | m
variable port | 1110111w | 11 (virtual 8086 mode: †25) | 5*/25** | | m
lea = load ea to register | 10001101 mod reg r/m | 2 | 2 | |

* if cpl ≤ iopl
** if cpl > iopl
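to show how a format entry in the table maps to bytes, here is a minimal sketch that assembles the "mov register to register/memory" form (1000100w mod reg r/m) for two 32-bit registers. it assumes a 32-bit default operand size (so w = 1 selects 32-bit operands with no prefix) and uses the standard intel386 register numbering, which is not listed in this part of the datasheet. the function name is illustrative.

```python
# Standard 3-bit register codes for 32-bit general registers.
# These codes are an assumption brought in from the usual Intel386
# encoding rules; they are not listed in this part of the datasheet.
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_mov_reg_to_reg(dst, src):
    """Encode 'mov dst, src' using the 1000100w mod reg r/m form.

    w = 1 selects the full operand size; mod = 11 selects a register
    in the r/m field. In this form the reg field holds the source and
    the r/m field holds the destination."""
    opcode = 0b10001000 | 0b1                      # 1000100w with w = 1
    modrm = (0b11 << 6) | (REG32[src] << 3) | REG32[dst]
    return bytes([opcode, modrm])
```

calling encode_mov_reg_to_reg("ebx", "eax") yields the two bytes 89 c3 (hex). per the table, this register-to-register form takes 2 clocks.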
table 8-1. military intel386™ processor instruction set clock count summary (continued)

instruction | format | clock count, real address mode or virtual 8086 mode | clock count, protected virtual address mode | notes, real address mode or virtual 8086 mode | notes, protected virtual address mode

segment control
lds = load pointer to ds | 11000101 mod reg r/m | 7 | 22 | b | h, i, j
les = load pointer to es | 11000100 mod reg r/m | 7 | 22 | b | h, i, j
lfs = load pointer to fs | 00001111 10110100 mod reg r/m | 7 | 25 | b | h, i, j
lgs = load pointer to gs | 00001111 10110101 mod reg r/m | 7 | 25 | b | h, i, j
lss = load pointer to ss | 00001111 10110010 mod reg r/m | 7 | 22 | b | h, i, j

flag control
clc = clear carry flag | 11111000 | 2 | 2 | |
cld = clear direction flag | 11111100 | 2 | 2 | |
cli = clear interrupt enable flag | 11111010 | 8 | 8 | | m
clts = clear task switched flag | 00001111 00000110 | 6 | 6 | c | l
cmc = complement carry flag | 11110101 | 2 | 2 | |
lahf = load ah into flag | 10011111 | 2 | 2 | |
popf = pop flags | 10011101 | 5 | 5 | b | h, n
pushf = push flags | 10011100 | 4 | 4 | b | h
sahf = store ah into flags | 10011110 | 3 | 3 | |
stc = set carry flag | 11111001 | 2 | 2 | |
std = set direction flag | 11111101 | 2 | 2 | |
sti = set interrupt enable flag | 11111011 | 8 | 8 | | m

arithmetic
add = add
register to register | 000000dw mod reg r/m | 2 | 2 | |
register to memory | 0000000w mod reg r/m | 7 | 7 | b | h
memory to register | 0000001w mod reg r/m | 6 | 6 | b | h
immediate to register/memory | 100000sw mod 000 r/m immediate data | 2/7 | 2/7 | b | h
immediate to accumulator (short form) | 0000010w immediate data | 2 | 2 | |
adc = add with carry
register to register | 000100dw mod reg r/m | 2 | 2 | |
register to memory | 0001000w mod reg r/m | 7 | 7 | b | h
memory to register | 0001001w mod reg r/m | 6 | 6 | b | h
immediate to register/memory | 100000sw mod 010 r/m immediate data | 2/7 | 2/7 | b | h
immediate to accumulator (short form) | 0001010w immediate data | 2 | 2 | |
inc = increment
register/memory | 1111111w mod 000 r/m | 2/6 | 2/6 | b | h
register (short form) | 01000 reg | 2 | 2 | |
sub = subtract
register from register | 001010dw mod reg r/m | 2 | 2 | |
table 8-1. military intel386™ processor instruction set clock count summary (continued)

instruction | format | clock count, real address mode or virtual 8086 mode | clock count, protected virtual address mode | notes, real address mode or virtual 8086 mode | notes, protected virtual address mode

arithmetic (continued)
register from memory | 0010100w mod reg r/m | 7 | 7 | b | h
memory from register | 0010101w mod reg r/m | 6 | 6 | b | h
immediate from register/memory | 100000sw mod 101 r/m immediate data | 2/7 | 2/7 | b | h
immediate from accumulator (short form) | 0010110w immediate data | 2 | 2 | |
sbb = subtract with borrow
register from register | 000110dw mod reg r/m | 2 | 2 | |
register from memory | 0001100w mod reg r/m | 7 | 7 | b | h
memory from register | 0001101w mod reg r/m | 6 | 6 | b | h
immediate from register/memory | 100000sw mod 011 r/m immediate data | 2/7 | 2/7 | b | h
immediate from accumulator (short form) | 0001110w immediate data | 2 | 2 | |
dec = decrement
register/memory | 1111111w mod 001 r/m | 2/6 | 2/6 | b | h
register (short form) | 01001 reg | 2 | 2 | |
cmp = compare
register with register | 001110dw mod reg r/m | 2 | 2 | |
memory with register | 0011100w mod reg r/m | 5 | 5 | b | h
register with memory | 0011101w mod reg r/m | 6 | 6 | b | h
immediate with register/memory | 100000sw mod 111 r/m immediate data | 2/5 | 2/5 | b | h
immediate with accumulator (short form) | 0011110w immediate data | 2 | 2 | |
neg = change sign | 1111011w mod 011 r/m | 2/6 | 2/6 | b | h
aaa = ascii adjust for add | 00110111 | 4 | 4 | |
aas = ascii adjust for subtract | 00111111 | 4 | 4 | |
daa = decimal adjust for add | 00100111 | 4 | 4 | |
das = decimal adjust for subtract | 00101111 | 4 | 4 | |
mul = multiply (unsigned)
accumulator with register/memory | 1111011w mod 100 r/m | | | |
multiplier - byte | | 12-17/15-20 | 12-17/15-20 | b, d | d, h
multiplier - word | | 12-25/15-28 | 12-25/15-28 | b, d | d, h
multiplier - doubleword | | 12-41/15-44 | 12-41/15-44 | b, d | d, h
imul = integer multiply (signed)
accumulator with register/memory | 1111011w mod 101 r/m | | | |
multiplier - byte | | 12-17/15-20 | 12-17/15-20 | b, d | d, h
multiplier - word | | 12-25/15-28 | 12-25/15-28 | b, d | d, h
multiplier - doubleword | | 12-41/15-44 | 12-41/15-44 | b, d | d, h
register with register/memory | 00001111 10101111 mod reg r/m | | | |
multiplier - byte | | 12-17/15-20 | 12-17/15-20 | b, d | d, h
multiplier - word | | 12-25/15-28 | 12-25/15-28 | b, d | d, h
multiplier - doubleword | | 12-41/15-44 | 12-41/15-44 | b, d | d, h
register/memory with immediate to register | 011010s1 mod reg r/m immediate data | | | |
multiplier - word | | 13-26/14-27 | 13-26/14-27 | b, d | d, h
multiplier - doubleword | | 13-42/14-43 | 13-42/14-43 | b, d | d, h
table 8-1. military intel386™ processor instruction set clock count summary (continued)

instruction | format | clock count, real address mode or virtual 8086 mode | clock count, protected virtual address mode | notes, real address mode or virtual 8086 mode | notes, protected virtual address mode

arithmetic (continued)
div = divide (unsigned)
accumulator by register/memory | 1111011w mod 110 r/m | | | |
divisor - byte | | 14/17 | 14/17 | b, e | e, h
divisor - word | | 22/25 | 22/25 | b, e | e, h
divisor - doubleword | | 38/41 | 38/41 | b, e | e, h
idiv = integer divide (signed)
accumulator by register/memory | 1111011w mod 111 r/m | | | |
divisor - byte | | 19/22 | 19/22 | b, e | e, h
divisor - word | | 27/30 | 27/30 | b, e | e, h
divisor - doubleword | | 43/46 | 43/46 | b, e | e, h
aad = ascii adjust for divide | 11010101 00001010 | 19 | 19 | |
aam = ascii adjust for multiply | 11010100 00001010 | 17 | 17 | |
cbw = convert byte to word | 10011000 | 3 | 3 | |
cwd = convert word to double word | 10011001 | 2 | 2 | |

logic
shift rotate instructions
not through carry (rol, ror, sal, sar, shl, and shr)
register/memory by 1 | 1101000w mod ttt r/m | 3/7 | 3/7 | b | h
register/memory by cl | 1101001w mod ttt r/m | 3/7 | 3/7 | b | h
register/memory by immediate count | 1100000w mod ttt r/m immed 8-bit data | 3/7 | 3/7 | b | h
through carry (rcl and rcr)
register/memory by 1 | 1101000w mod ttt r/m | 9/10 | 9/10 | b | h
register/memory by cl | 1101001w mod ttt r/m | 9/10 | 9/10 | b | h
register/memory by immediate count | 1100000w mod ttt r/m immed 8-bit data | 9/10 | 9/10 | b | h

ttt | instruction
000 | rol
001 | ror
010 | rcl
011 | rcr
100 | shl/sal
101 | shr
111 | sar

shld = shift left double
register/memory by immediate | 00001111 10100100 mod reg r/m immed 8-bit data | 3/7 | 3/7 | |
register/memory by cl | 00001111 10100101 mod reg r/m | 3/7 | 3/7 | |
shrd = shift right double
register/memory by immediate | 00001111 10101100 mod reg r/m immed 8-bit data | 3/7 | 3/7 | |
register/memory by cl | 00001111 10101101 mod reg r/m | 3/7 | 3/7 | |
and = and
register to register | 001000dw mod reg r/m | 2 | 2 | |
table 8-1. military intel386™ processor instruction set clock count summary (continued)

instruction | format | clock count, real address mode or virtual 8086 mode | clock count, protected virtual address mode | notes, real address mode or virtual 8086 mode | notes, protected virtual address mode

logic (continued)
register to memory | 0010000w mod reg r/m | 7 | 7 | b | h
memory to register | 0010001w mod reg r/m | 6 | 6 | b | h
immediate to register/memory | 100000sw mod 100 r/m immediate data | 2/7 | 2/7 | b | h
immediate to accumulator (short form) | 0010010w immediate data | 2 | 2 | |
test = and function to flags, no result
register/memory and register | 1000010w mod reg r/m | 2/5 | 2/5 | b | h
immediate data and register/memory | 1111011w mod 000 r/m immediate data | 2/5 | 2/5 | b | h
immediate data and accumulator (short form) | 1010100w immediate data | 2 | 2 | |
or = or
register to register | 000010dw mod reg r/m | 2 | 2 | |
register to memory | 0000100w mod reg r/m | 7 | 7 | b | h
memory to register | 0000101w mod reg r/m | 6 | 6 | b | h
immediate to register/memory | 100000sw mod 001 r/m immediate data | 2/7 | 2/7 | b | h
immediate to accumulator (short form) | 0000110w immediate data | 2 | 2 | |
xor = exclusive or
register to register | 001100dw mod reg r/m | 2 | 2 | |
register to memory | 0011000w mod reg r/m | 7 | 7 | b | h
memory to register | 0011001w mod reg r/m | 6 | 6 | b | h
immediate to register/memory | 100000sw mod 110 r/m immediate data | 2/7 | 2/7 | b | h
immediate to accumulator (short form) | 0011010w immediate data | 2 | 2 | |
not = invert register/memory | 1111011w mod 010 r/m | 2/6 | 2/6 | b | h

string manipulation
cmps = compare byte word | 1010011w | 10 | 10 | b | h
ins = input byte/word from dx port | 0110110w | 15 (virtual 8086 mode: †29) | 9*/29** | b | h, m
lods = load byte/word to al/ax/eax | 1010110w | 5 | 5 | b | h
movs = move byte word | 1010010w | 8 | 8 | b | h
outs = output byte/word to dx port | 0110111w | 14 (virtual 8086 mode: †28) | 8*/28** | b | h, m
scas = scan byte word | 1010111w | 8 | 8 | b | h
stos = store byte/word from al/ax/eax | 1010101w | 5 | 5 | b | h
xlat = translate string | 11010111 | 5 | 5 | | h

repeated string manipulation
repeated by count in cx or ecx
repe cmps = compare string (find non-match) | 11110011 1010011w | 5+9n | 5+9n | b | h

* if cpl ≤ iopl
** if cpl > iopl
table 8-1. military intel386™ processor instruction set clock count summary (continued)

instruction | format | clock count, real address mode or virtual 8086 mode | clock count, protected virtual address mode | notes, real address mode or virtual 8086 mode | notes, protected virtual address mode

repeated string manipulation (continued)
repne cmps = compare string (find match) | 11110010 1010011w | 5+9n | 5+9n | b | h
rep ins = input string | 11110010 0110110w | 14+6n (virtual 8086 mode: †28+6n) | 8+6n*/28+6n** | b | h, m
rep lods = load string | 11110010 1010110w | 5+6n | 5+6n | b | h
rep movs = move string | 11110010 1010010w | 8+4n | 8+4n | b | h
rep outs = output string | 11110010 0110111w | 12+5n (virtual 8086 mode: †26+5n) | 6+5n*/26+5n** | b | h, m
repe scas = scan string (find non-al/ax/eax) | 11110011 1010111w | 5+8n | 5+8n | b | h
repne scas = scan string (find al/ax/eax) | 11110010 1010111w | 5+8n | 5+8n | b | h
rep stos = store string | 11110010 1010101w | 5+5n | 5+5n | b | h

bit manipulation
bsf = scan bit forward | 00001111 10111100 mod reg r/m | 11+3n | 11+3n | b | h
bsr = scan bit reverse | 00001111 10111101 mod reg r/m | 9+3n | 9+3n | b | h
bt = test bit
register/memory, immediate | 00001111 10111010 mod 100 r/m immed 8-bit data | 3/6 | 3/6 | b | h
register/memory, register | 00001111 10100011 mod reg r/m | 3/12 | 3/12 | b | h
btc = test bit and complement
register/memory, immediate | 00001111 10111010 mod 111 r/m immed 8-bit data | 6/8 | 6/8 | b | h
register/memory, register | 00001111 10111011 mod reg r/m | 6/13 | 6/13 | b | h
btr = test bit and reset
register/memory, immediate | 00001111 10111010 mod 110 r/m immed 8-bit data | 6/8 | 6/8 | b | h
register/memory, register | 00001111 10110011 mod reg r/m | 6/13 | 6/13 | b | h
bts = test bit and set
register/memory, immediate | 00001111 10111010 mod 101 r/m immed 8-bit data | 6/8 | 6/8 | b | h
register/memory, register | 00001111 10101011 mod reg r/m | 6/13 | 6/13 | b | h

control transfer
call = call
direct within segment | 11101000 full displacement | 7+m | 7+m | b | r
register/memory indirect within segment | 11111111 mod 010 r/m | 7+m/10+m | 7+m/10+m | b | h, r
direct intersegment | 10011010 unsigned full offset, selector | 17+m | 34+m | b | j, k, r

notes:
† clock count shown applies if i/o permission allows i/o to the port in virtual 8086 mode. if the i/o bit map denies permission, exception 13 fault occurs; refer to clock counts for int 3 instruction.
* if cpl ≤ iopl
** if cpl > iopl
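the repeated string entries give clock counts as a fixed part plus a per-iteration part, where n is the number of repetitions. reading those entries as "base + k·n", a small lookup makes it easy to estimate the cost of a block operation. this is a sketch under stated assumptions: the numbers are the real mode / protected mode counts listed in the table (the i/o string forms are left out because they depend on cpl and iopl), and the names are illustrative.

```python
# (base clocks, clocks per iteration) for repeated string instructions,
# as listed in the clock count table. I/O forms (rep ins, rep outs) are
# omitted because their counts depend on privilege level.
REP_STRING_CLOCKS = {
    "cmps": (5, 9),
    "lods": (5, 6),
    "movs": (8, 4),
    "scas": (5, 8),
    "stos": (5, 5),
}


def rep_string_clocks(instruction, n):
    """Clock count for a repeated string instruction executed n times."""
    base, per_iteration = REP_STRING_CLOCKS[instruction]
    return base + per_iteration * n


def rep_string_time_us(instruction, n, processor_mhz):
    """Elapsed time in microseconds, assuming zero wait states."""
    return rep_string_clocks(instruction, n) / processor_mhz
```

for example, rep movs with n = 100 takes 8 + 4 × 100 = 408 clocks, which is 25.5 µs at 16 mhz with no wait states.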
table 8-1. military intel386™ processor instruction set clock count summary (continued)

instruction | format | clock count, real address mode or virtual 8086 mode | clock count, protected virtual address mode | notes, real address mode or virtual 8086 mode | notes, protected virtual address mode

control transfer (continued)
protected mode only (direct intersegment)
via call gate to same privilege level | | | 52+m | | h, j, k, r
via call gate to different privilege level (no parameters) | | | 86+m | | h, j, k, r
via call gate to different privilege level (x parameters) | | | 94+4x+m | | h, j, k, r
from 80286 task to 80286 tss | | | 273 | | h, j, k, r
from 80286 task to intel386™ dx tss | | | 298 | | h, j, k, r
from 80286 task to virtual 8086 task (intel386 dx tss) | | | 218 | | h, j, k, r
from intel386 dx task to 80286 tss | | | 273 | | h, j, k, r
from intel386 dx task to intel386 dx tss | | | 300 | | h, j, k, r
from intel386 dx task to virtual 8086 task (intel386 dx tss) | | | 218 | | h, j, k, r
indirect intersegment | 11111111 mod 011 r/m | 22+m | 38+m | b | h, j, k, r
protected mode only (indirect intersegment)
via call gate to same privilege level | | | 56+m | | h, j, k, r
via call gate to different privilege level (no parameters) | | | 90+m | | h, j, k, r
via call gate to different privilege level (x parameters) | | | 98+4x+m | | h, j, k, r
from 80286 task to 80286 tss | | | 278 | | h, j, k, r
from 80286 task to intel386 dx tss | | | 303 | | h, j, k, r
from 80286 task to virtual 8086 task (intel386 dx tss) | | | 222 | | h, j, k, r
from intel386 dx task to 80286 tss | | | 278 | | h, j, k, r
from intel386 dx task to intel386 dx tss | | | 305 | | h, j, k, r
from intel386 dx task to virtual 8086 task (intel386 dx tss) | | | 222 | | h, j, k, r

jmp = unconditional jump
short | 11101011 8-bit displacement | 7+m | 7+m | | r
direct within segment | 11101001 full displacement | 7+m | 7+m | | r
register/memory indirect within segment | 11111111 mod 100 r/m | 7+m/10+m | 7+m/10+m | b | h, r
direct intersegment | 11101010 unsigned full offset, selector | 12+m | 27+m | | j, k, r
protected mode only (direct intersegment)
via call gate to same privilege level | | | 45+m | | h, j, k, r
from 80286 task to 80286 tss | | | 274 | | h, j, k, r
from 80286 task to intel386 dx tss | | | 301 | | h, j, k, r
from 80286 task to virtual 8086 task (intel386 dx tss) | | | 219 | | h, j, k, r
from intel386 dx task to 80286 tss | | | 270 | | h, j, k, r
from intel386 dx task to intel386 dx tss | | | 303 | | h, j, k, r
from intel386 dx task to virtual 8086 task (intel386 dx tss) | | | 221 | | h, j, k, r
indirect intersegment | 11111111 mod 101 r/m | 17+m | 31+m | b | h, j, k, r
protected mode only (indirect intersegment)
via call gate to same privilege level | | | 49+m | | h, j, k, r
from 80286 task to 80286 tss | | | 279 | | h, j, k, r
from 80286 task to intel386 dx tss | | | 306 | | h, j, k, r
from 80286 task to virtual 8086 task (intel386 dx tss) | | | 223 | | h, j, k, r
from intel386 dx task to 80286 tss | | | 275 | | h, j, k, r
from intel386 dx task to intel386 dx tss | | | 308 | | h, j, k, r
from intel386 dx task to virtual 8086 task (intel386 dx tss) | | | 225 | | h, j, k, r
table 8-1. military intel386™ processor instruction set clock count summary (continued)

instruction | format | clock count, real address mode or virtual 8086 mode | clock count, protected virtual address mode | notes, real address mode or virtual 8086 mode | notes, protected virtual address mode

control transfer (continued)
ret = return from call:
within segment | 11000011 | 10+m | 10+m | b | g, h, r
within segment adding immediate to sp | 11000010 16-bit displ | 10+m | 10+m | b | g, h, r
intersegment | 11001011 | 18+m | 32+m | b | g, h, j, k, r
intersegment adding immediate to sp | 11001010 16-bit displ | 18+m | 32+m | b | g, h, j, k, r
protected mode only (ret): to different privilege level
intersegment | | | 69 | | h, j, k, r
intersegment adding immediate to sp | | | 69 | | h, j, k, r

conditional jumps
note: times are jump "taken or not taken"
jo = jump on overflow
8-bit displacement | 01110000 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10000000 full displacement | 7+m or 3 | 7+m or 3 | | r
jno = jump on not overflow
8-bit displacement | 01110001 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10000001 full displacement | 7+m or 3 | 7+m or 3 | | r
jb/jnae = jump on below/not above or equal
8-bit displacement | 01110010 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10000010 full displacement | 7+m or 3 | 7+m or 3 | | r
jnb/jae = jump on not below/above or equal
8-bit displacement | 01110011 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10000011 full displacement | 7+m or 3 | 7+m or 3 | | r
je/jz = jump on equal/zero
8-bit displacement | 01110100 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10000100 full displacement | 7+m or 3 | 7+m or 3 | | r
jne/jnz = jump on not equal/not zero
8-bit displacement | 01110101 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10000101 full displacement | 7+m or 3 | 7+m or 3 | | r
jbe/jna = jump on below or equal/not above
8-bit displacement | 01110110 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10000110 full displacement | 7+m or 3 | 7+m or 3 | | r
jnbe/ja = jump on not below or equal/above
8-bit displacement | 01110111 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10000111 full displacement | 7+m or 3 | 7+m or 3 | | r
js = jump on sign
8-bit displacement | 01111000 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10001000 full displacement | 7+m or 3 | 7+m or 3 | | r
table 8-1. military intel386™ processor instruction set clock count summary (continued)

instruction | format | clock count, real address mode or virtual 8086 mode | clock count, protected virtual address mode | notes, real address mode or virtual 8086 mode | notes, protected virtual address mode

conditional jumps (continued)
jns = jump on not sign
8-bit displacement | 01111001 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10001001 full displacement | 7+m or 3 | 7+m or 3 | | r
jp/jpe = jump on parity/parity even
8-bit displacement | 01111010 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10001010 full displacement | 7+m or 3 | 7+m or 3 | | r
jnp/jpo = jump on not parity/parity odd
8-bit displacement | 01111011 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10001011 full displacement | 7+m or 3 | 7+m or 3 | | r
jl/jnge = jump on less/not greater or equal
8-bit displacement | 01111100 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10001100 full displacement | 7+m or 3 | 7+m or 3 | | r
jnl/jge = jump on not less/greater or equal
8-bit displacement | 01111101 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10001101 full displacement | 7+m or 3 | 7+m or 3 | | r
jle/jng = jump on less or equal/not greater
8-bit displacement | 01111110 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10001110 full displacement | 7+m or 3 | 7+m or 3 | | r
jnle/jg = jump on not less or equal/greater
8-bit displacement | 01111111 8-bit displ | 7+m or 3 | 7+m or 3 | | r
full displacement | 00001111 10001111 full displacement | 7+m or 3 | 7+m or 3 | | r
jcxz = jump on cx zero | 11100011 8-bit displ | 9+m or 5 | 9+m or 5 | | r
jecxz = jump on ecx zero | 11100011 8-bit displ | 9+m or 5 | 9+m or 5 | | r
(address size prefix differentiates jcxz from jecxz)
loop = loop cx times | 11100010 8-bit displ | 11+m | 11+m | | r
loopz/loope = loop with zero/equal | 11100001 8-bit displ | 11+m | 11+m | | r
loopnz/loopne = loop while not zero | 11100000 8-bit displ | 11+m | 11+m | | r

conditional byte set
note: times are register/memory
seto = set byte on overflow
to register/memory | 00001111 10010000 mod 000 r/m | 4/5 | 4/5 | | h
setno = set byte on not overflow
to register/memory | 00001111 10010001 mod 000 r/m | 4/5 | 4/5 | | h
setb/setnae = set byte on below/not above or equal
to register/memory | 00001111 10010010 mod 000 r/m | 4/5 | 4/5 | | h
military intel386 tm microprocessor table 8-1. military intel386 tm processor instruction set clock count summary (continued) clock count notes real real instruction format address protected address protected mode or virtual mode or virtual virtual address virtual address 8086 mode 8086 mode mode mode conditional byte set (continued) setnb e set byte on not below/above or equal to register/memory 00001111 10010011 mod000 r/m 4/5 4/5 h sete/setz e set byte on equal/zero to register/memory 00001111 10010100 mod000 r/m 4/5 4/5 h setne/setnz e set byte on not equal/not zero to register/memory 00001111 10010101 mod000 r/m 4/5 4/5 h setbe/setna e set byte on below or equal/not above to register/memory 00001111 10010110 mod000 r/m 4/5 4/5 h setnbe/seta e set byte on not below or equal/above to register/memory 00001111 10010111 mod000 r/m 4/5 4/5 h sets e set byte on sign to register/memory 00001111 10011000 mod000 r/m 4/5 4/5 h setns e set byte on not sign to register/memory 00001111 10011001 mod000 r/m 4/5 4/5 h setp/setpe e set byte on parity/parity even to register/memory 00001111 10011010 mod000 r/m 4/5 4/5 h setnp/setpo e set byte on not parity/parity odd to register/memory 00001111 10011011 mod000 r/m 4/5 4/5 h setl/setnge e set byte on less/not greater or equal to register/memory 00001111 10011100 mod000 r/m 4/5 4/5 h setnl/setge e set byte on not less/greater or equal to register/memory 00001111 01111101 mod000 r/m 4/5 4/5 h setle/setng e set byte on less or equal/not greater to register/memory 00001111 10011110 mod000 r/m 4/5 4/5 h setnle/setg e set byte on not less or equal/greater to register/memory 00001111 10011111 mod000 r/m 4/5 4/5 h enter e enter procedure 11001000 16-bit displacement, 8-bit level l e 0 10 10 b h l e 1 12 12 b h l l 1 15 a 15 a bh 4(n b 1) 4(n b 1) leave e leave procedure 11001001 4 4 b h 122
Table 8-1. Military Intel386™ Processor Instruction Set Clock Count Summary (continued)

Real/V86 = real address mode or virtual 8086 mode; Protected = protected virtual address mode; "-" marks an empty cell.

Instruction and format                                        Clock count             Notes
                                                              Real/V86    Protected   Real/V86  Protected

INTERRUPT INSTRUCTIONS

INT = Interrupt
  type specified   11001101  type                             37          -           b         -
  type 3           11001100                                   33          -           b         -
INTO = Interrupt 4 if Overflow Flag Set   11001110
  if OF = 1                                                   35          -           b, e      -
  if OF = 0                                                   3           3           b, e      -
BOUND = Interrupt 5 if Detect Value Out of Range   01100010  mod reg r/m
  if out of range                                             44          -           b, e      e, g, h, j, k, r
  if in range                                                 10          10          b, e      e, g, h, j, k, r

Protected mode only (INT)                                               Clocks  Notes

INT: type specified
  via interrupt or trap gate to same privilege level                    59      g, j, k, r
  via interrupt or trap gate to different privilege level               99      g, j, k, r
  from 80286 task to 80286 TSS via task gate                            282     g, j, k, r
  from 80286 task to Intel386 DX TSS via task gate                      309     g, j, k, r
  from 80286 task to virt 8086 md via task gate                         226     g, j, k, r
  from Intel386 DX task to 80286 TSS via task gate                      284     g, j, k, r
  from Intel386 DX task to Intel386 DX TSS via task gate                311     g, j, k, r
  from Intel386 DX task to virt 8086 md via task gate                   228     g, j, k, r
  from virt 8086 md to 80286 TSS via task gate                          289     g, j, k, r
  from virt 8086 md to Intel386 DX TSS via task gate                    316     g, j, k, r
  from virt 8086 md to priv level 0 via trap gate or interrupt gate     119     -

INT: type 3
  via interrupt or trap gate to same privilege level                    59      g, j, k, r
  via interrupt or trap gate to different privilege level               99      g, j, k, r
  from 80286 task to 80286 TSS via task gate                            278     g, j, k, r
  from 80286 task to Intel386 DX TSS via task gate                      305     g, j, k, r
  from 80286 task to virt 8086 md via task gate                         222     g, j, k, r
  from Intel386 DX task to 80286 TSS via task gate                      280     g, j, k, r
  from Intel386 DX task to Intel386 DX TSS via task gate                307     g, j, k, r
  from Intel386 DX task to virt 8086 md via task gate                   224     g, j, k, r
  from virt 8086 md to 80286 TSS via task gate                          285     g, j, k, r
  from virt 8086 md to Intel386 DX TSS via task gate                    312     g, j, k, r
  from virt 8086 md to priv level 0 via trap gate or interrupt gate     119     -

INTO:
  via interrupt or trap gate to same privilege level                    59      g, j, k, r
  via interrupt or trap gate to different privilege level               99      g, j, k, r
  from 80286 task to 80286 TSS via task gate                            280     g, j, k, r
  from 80286 task to Intel386 DX TSS via task gate                      307     g, j, k, r
  from 80286 task to virt 8086 md via task gate                         224     g, j, k, r
  from Intel386 DX task to 80286 TSS via task gate                      282     g, j, k, r
  from Intel386 DX task to Intel386 DX TSS via task gate                309     g, j, k, r
  from Intel386 DX task to virt 8086 md via task gate                   225     g, j, k, r
  from virt 8086 md to 80286 TSS via task gate                          287     g, j, k, r
  from virt 8086 md to Intel386 DX TSS via task gate                    314     g, j, k, r
  from virt 8086 md to priv level 0 via trap gate or interrupt gate     119     -
Table 8-1. Military Intel386™ Processor Instruction Set Clock Count Summary (continued)

Real/V86 = real address mode or virtual 8086 mode; Protected = protected virtual address mode; "-" marks an empty cell.

INTERRUPT INSTRUCTIONS (continued)

Protected mode only                                                     Clocks  Notes

BOUND:
  via interrupt or trap gate to same privilege level                    59      g, j, k, r
  via interrupt or trap gate to different privilege level               99      g, j, k, r
  from 80286 task to 80286 TSS via task gate                            254     g, j, k, r
  from 80286 task to Intel386 DX TSS via task gate                      284     g, j, k, r
  from 80286 task to virt 8086 mode via task gate                       231     g, j, k, r
  from Intel386 DX task to 80286 TSS via task gate                      264     g, j, k, r
  from Intel386 DX task to Intel386 DX TSS via task gate                294     g, j, k, r
  from Intel386 DX task to virt 8086 mode via task gate                 243     g, j, k, r
  from virt 8086 mode to 80286 TSS via task gate                        264     g, j, k, r
  from virt 8086 mode to Intel386 DX TSS via task gate                  294     g, j, k, r
  from virt 8086 md to priv level 0 via trap gate or interrupt gate     119     -

INTERRUPT RETURN

Instruction and format                                        Clock count             Notes
                                                              Real/V86    Protected   Real/V86  Protected

IRET = Interrupt Return   11001111                            22          -           -         g, h, j, k, r

Protected mode only (IRET)                                              Clocks  Notes

  to the same privilege level (within task)                             38      g, h, j, k, r
  to different privilege level (within task)                            82      g, h, j, k, r
  from 80286 task to 80286 TSS                                          232     h, j, k, r
  from 80286 task to Intel386 DX TSS                                    265     h, j, k, r
  from 80286 task to virtual 8086 task                                  213     h, j, k, r
  from 80286 task to virtual 8086 mode (within task)                    60      -
  from Intel386 DX task to 80286 TSS                                    271     h, j, k, r
  from Intel386 DX task to Intel386 DX TSS                              275     h, j, k, r
  from Intel386 DX task to virtual 8086 task                            223     h, j, k, r
  from Intel386 DX task to virtual 8086 mode (within task)              60      -

PROCESSOR CONTROL

Instruction and format                                        Clock count             Notes
                                                              Real/V86    Protected   Real/V86  Protected

HLT = Halt   11110100                                         5           5           -         l
MOV = Move to and from Control/Debug/Test Registers
  CR0/CR2/CR3 from register   00001111 00100010  11 eee reg   11/4/5      11/4/5      -         l
  register from CR0-3         00001111 00100000  11 eee reg   6           6           -         l
  DR0-3 from register         00001111 00100011  11 eee reg   22          22          -         l
  DR6-7 from register         00001111 00100011  11 eee reg   16          16          -         l
  register from DR6-7         00001111 00100001  11 eee reg   14          14          -         l
  register from DR0-3         00001111 00100001  11 eee reg   22          22          -         l
  TR6-7 from register         00001111 00100110  11 eee reg   12          12          -         l
  register from TR6-7         00001111 00100100  11 eee reg   12          12          -         l
NOP = No Operation   10010000                                 3           3           -         -
WAIT = Wait until BUSY# Pin is Negated   10011011             7           7           -         -
Table 8-1. Military Intel386™ Processor Instruction Set Clock Count Summary (continued)

Real/V86 = real address mode or virtual 8086 mode; Protected = protected virtual address mode; "-" marks an empty cell.

Instruction and format                                        Clock count             Notes
                                                              Real/V86    Protected   Real/V86  Protected

PROCESSOR EXTENSION INSTRUCTIONS

Processor Extension Escape   11011TTT  mod LLL r/m            see 80287/80387         -         h
                                                              data sheets for
                                                              clock counts
  TTT and LLL bits are opcode information for coprocessor.

PREFIX BYTES

Address Size Prefix   01100111                                0           0           -         -
LOCK = Bus Lock Prefix   11110000                             0           0           -         m
Operand Size Prefix   01100110                                0           0           -         -
Segment Override Prefix
  CS:   00101110                                              0           0           -         -
  DS:   00111110                                              0           0           -         -
  ES:   00100110                                              0           0           -         -
  FS:   01100100                                              0           0           -         -
  GS:   01100101                                              0           0           -         -
  SS:   00110110                                              0           0           -         -

PROTECTION CONTROL

ARPL = Adjust Requested Privilege Level
  from register/memory   01100011  mod reg r/m                N/A         20/21       a         h
LAR = Load Access Rights
  from register/memory   00001111 00000010  mod reg r/m       N/A         15/16       a         g, h, j, p
LGDT = Load Global Descriptor Table Register
  00001111 00000001  mod 010 r/m                              11          11          b, c      h, l
LIDT = Load Interrupt Descriptor Table Register
  00001111 00000001  mod 011 r/m                              11          11          b, c      h, l
LLDT = Load Local Descriptor Table Register
  to register/memory   00001111 00000000  mod 010 r/m         N/A         20/24       a         g, h, j, l
LMSW = Load Machine Status Word
  from register/memory   00001111 00000001  mod 110 r/m       11/14       11/14       b, c      h, l
LSL = Load Segment Limit
  from register/memory   00001111 00000011  mod reg r/m
  byte-granular limit                                         N/A         21/22       a         g, h, j, p
  page-granular limit                                         N/A         25/26       a         g, h, j, p
LTR = Load Task Register
  from register/memory   00001111 00000000  mod 011 r/m       N/A         23/27       a         g, h, j, l
SGDT = Store Global Descriptor Table Register
  00001111 00000001  mod 000 r/m                              9           9           b, c      h
SIDT = Store Interrupt Descriptor Table Register
  00001111 00000001  mod 001 r/m                              9           9           b, c      h
SLDT = Store Local Descriptor Table Register
  to register/memory   00001111 00000000  mod 000 r/m         N/A         2/2         a         h
Table 8-1. Military Intel386™ Processor Instruction Set Clock Count Summary (continued)

Real/V86 = real address mode or virtual 8086 mode; Protected = protected virtual address mode; "-" marks an empty cell.

Instruction and format                                        Clock count             Notes
                                                              Real/V86    Protected   Real/V86  Protected

SMSW = Store Machine Status Word
  00001111 00000001  mod 100 r/m                              2/2         2/2         b, c      h, l
STR = Store Task Register
  to register/memory   00001111 00000000  mod 001 r/m         N/A         2/2         a         h
VERR = Verify Read Access
  register/memory   00001111 00000000  mod 100 r/m            N/A         10/11       a         g, h, j, p
VERW = Verify Write Access
  00001111 00000000  mod 101 r/m                              N/A         15/16       a         g, h, j, p

INSTRUCTION NOTES FOR TABLE 8-1

Notes a through c apply to Intel386 DX real address mode only:

a. This is a protected mode instruction. Attempted execution in real mode will result in exception 6 (invalid opcode).
b. Exception 13 fault (general protection) will occur in real mode if an operand reference is made that partially or fully extends beyond the maximum CS, DS, ES, FS or GS limit, FFFFH. Exception 12 fault (stack segment limit violation or not present) will occur in real mode if an operand reference is made that partially or fully extends beyond the maximum SS limit.
c. This instruction may be executed in real mode. In real mode, its purpose is primarily to initialize the CPU for protected mode.

Notes d through g apply to Intel386 DX real address mode and Intel386 DX protected virtual address mode:

d. The Intel386 DX uses an early-out multiply algorithm. The actual number of clocks depends on the position of the most significant bit in the operand (multiplier). Clock counts given are minimum to maximum. To calculate actual clocks use the following formula:

     Actual clock = max([log2 |m|], 3) + b clocks    if m <> 0
     Actual clock = 3 + b clocks                     if m = 0

   In this formula, m is the multiplier, and
     b = 9 for register to register,
     b = 12 for memory to register,
     b = 10 for register with immediate to register,
     b = 11 for memory with immediate to register.
e. An exception may occur, depending on the value of the operand.
f. LOCK# is automatically asserted, regardless of the presence or absence of the LOCK# prefix.
g. LOCK# is asserted during descriptor table accesses.

Notes h through r apply to Intel386 DX protected virtual address mode only:

h. Exception 13 fault (general protection violation) will occur if the memory operand in CS, DS, ES, FS or GS cannot be used due to either a segment limit violation or access rights violation. If a stack limit is violated, an exception 12 (stack segment limit violation or not present) occurs.
i. For segment load operations, the CPL, RPL, and DPL must agree with the privilege rules to avoid an exception 13 fault (general protection violation). The segment's descriptor must indicate "present" or exception 11 (CS, DS, ES, FS, GS not present) occurs. If the SS register is loaded and a stack segment not present is detected, an exception 12 (stack segment limit violation or not present) occurs.
j. All segment descriptor accesses in the GDT or LDT made by this instruction will automatically assert LOCK to maintain descriptor integrity in multiprocessor systems.
k. JMP, CALL, INT, RET and IRET instructions referring to another code segment will cause an exception 13 (general protection violation) if an applicable privilege rule is violated.
l. An exception 13 fault occurs if CPL is greater than 0 (0 is the most privileged level).
m. An exception 13 fault occurs if CPL is greater than IOPL.
n. The IF bit of the flag register is not updated if CPL is greater than IOPL. The IOPL and VM fields of the flag register are updated only if CPL = 0.
o. The PE bit of the MSW (CR0) cannot be reset by this instruction. Use MOV into CR0 if desiring to reset the PE bit.
p. Any violation of privilege rules as applied to the selector operand does not cause a protection exception; rather, the zero flag is cleared.
q. If the coprocessor's memory operand violates a segment limit or segment access rights, an exception 13 fault (general protection exception) will occur before the ESC instruction is executed. An exception 12 fault (stack segment limit violation or not present) will occur if the stack limit is violated by the operand's starting address.
r. The destination of a JMP, CALL, INT, RET or IRET must be in the defined limit of a code segment or an exception 13 fault (general protection violation) will occur.
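The early-out multiply formula in note d can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the brackets around log2 |m| are read here as rounding up (ceiling), and the function name and the operand-form labels are hypothetical, not part of the data sheet.

```python
# Base clock counts b for each operand form, as listed in note d.
BASE_CLOCKS = {
    "reg_to_reg": 9,
    "mem_to_reg": 12,
    "reg_imm_to_reg": 10,
    "mem_imm_to_reg": 11,
}


def multiply_clocks(m: int, form: str) -> int:
    """Estimate the clock count for a multiply with multiplier m (note d)."""
    b = BASE_CLOCKS[form]
    if m == 0:
        return 3 + b
    # Assumption: [log2 |m|] means ceiling(log2 |m|).
    # For n >= 1, ceiling(log2 n) equals (n - 1).bit_length(), which avoids
    # floating-point rounding.
    bits = (abs(m) - 1).bit_length()
    return max(bits, 3) + b


if __name__ == "__main__":
    for multiplier in (0, 1, 255, -65536):
        print(multiplier, multiply_clocks(multiplier, "reg_to_reg"))
```

Small multipliers always cost at least 3 + b clocks, and the cost then grows with the position of the most significant bit of the multiplier.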
7.2 INSTRUCTION ENCODING

7.2.1 Overview

All instruction encodings are subsets of the general instruction format shown in Figure 8-1. Instructions consist of one or two primary opcode bytes, possibly an address specifier consisting of the "mod r/m" byte and "scaled index" byte, a displacement if required, and an immediate data field if required.

Within the primary opcode or opcodes, smaller encoding fields may be defined. These fields vary according to the class of operation. The fields define such information as direction of the operation, size of the displacements, register encoding, or sign extension.

Almost all instructions referring to an operand in memory have an addressing mode byte following the primary opcode byte(s). This byte, the mod r/m byte, specifies the address mode to be used. Certain encodings of the mod r/m byte indicate that a second addressing byte, the scale-index-base byte, follows the mod r/m byte to fully specify the addressing mode.

Addressing modes can include a displacement immediately following the mod r/m byte or scaled index byte. If a displacement is present, the possible sizes are 8, 16 or 32 bits.

If the instruction specifies an immediate operand, the immediate operand follows any displacement bytes. The immediate operand, if specified, is always the last field of the instruction.

Figure 8-1 illustrates several of the fields that can appear in an instruction, such as the mod field and the r/m field, but the figure does not show all fields. Several smaller fields also appear in certain instructions, sometimes within the opcode bytes themselves. Table 8-2 is a complete list of all fields appearing in the Military Intel386 processor instruction set. Detailed tables for each field follow Table 8-2.

Figure 8-1. General Instruction Format

  Opcode (one or two bytes; T represents an opcode bit):   TTTTTTTT  TTTTTTTT   (bits 7-0 of each byte)
  "mod r/m" byte:                                          mod (bits 7-6), TTT (bits 5-3), r/m (bits 2-0)
  "s-i-b" byte:                                            ss (bits 7-6), index (bits 5-3), base (bits 2-0)
  Address displacement (4, 2, 1 bytes or none):            d32 | 16 | 8 | none
  Immediate data (4, 2, 1 bytes or none):                  data32 | 16 | 8 | none

  The "mod r/m" byte and the "s-i-b" byte together form the register and address mode specifier.

Table 8-2. Fields within Military Intel386™ Processor Instructions

Field name   Description                                                                  Number of bits
w            Specifies if data is byte or full size (full size is either 16 or 32 bits)   1
d            Specifies direction of data operation                                        1
s            Specifies if an immediate data field must be sign-extended                   1
reg          General register specifier                                                   3
mod r/m      Address mode specifier (effective address can be a general register)         2 for mod; 3 for r/m
ss           Scale factor for scaled index address mode                                   2
index        General register to be used as index register                                3
base         General register to be used as base register                                 3
sreg2        Segment register specifier for CS, SS, DS, ES                                2
sreg3        Segment register specifier for CS, SS, DS, ES, FS, GS                        3
tttn         For conditional instructions, specifies a condition asserted
             or a condition negated                                                       4

Note: Table 8-1 shows encoding of individual instructions.
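As a small illustration of the bit positions given for Figure 8-1 and the field widths in Table 8-2, the following Python sketch splits a "mod r/m" byte and an "s-i-b" byte into their fields. The function names are hypothetical and chosen only for this example.

```python
def split_modrm(byte: int) -> tuple[int, int, int]:
    """Return (mod, ttt_or_reg, rm) from a "mod r/m" byte.

    mod occupies bits 7-6, the TTT/reg field bits 5-3, and r/m bits 2-0.
    """
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def split_sib(byte: int) -> tuple[int, int, int]:
    """Return (ss, index, base) from an "s-i-b" byte.

    ss occupies bits 7-6, index bits 5-3, and base bits 2-0.
    """
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


if __name__ == "__main__":
    print(split_modrm(0b01_000_100))  # mod = 01, reg = 000, r/m = 100
    print(split_sib(0b10_001_011))    # ss = 10, index = 001, base = 011
```

Both bytes share the same 2-3-3 layout, which is why the two helpers have identical bodies; they are kept separate only so the returned field names stay clear.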
7.2.2 32-Bit Extensions of the Instruction Set

With the Military Intel386 processor, the 86/186/286 instruction set is extended in two orthogonal directions: 32-bit forms of all 16-bit instructions are added to support the 32-bit data types, and 32-bit addressing modes are made available for all instructions referencing memory. This orthogonal instruction set extension is accomplished by having a default (D) bit in the code segment descriptor, and by having 2 prefixes to the instruction set.

Whether the instruction defaults to operations of 16 bits or 32 bits depends on the setting of the D bit in the code segment descriptor, which gives the default length (either 32 bits or 16 bits) for both operands and effective addresses when executing that code segment. In the real address mode or virtual M8086 mode, no code segment descriptors are used, but a D value of 0 is assumed internally by the Military Intel386 processor when operating in those modes (for 16-bit default sizes compatible with the M8086/M80186/M80286).

Two prefixes, the operand size prefix and the effective address size prefix, allow overriding individually the default selection of operand size and effective address size. These prefixes may precede any opcode bytes and affect only the instruction they precede. If necessary, one or both of the prefixes may be placed before the opcode bytes. The presence of the operand size prefix and the effective address prefix will toggle the operand size or the effective address size, respectively, to the value "opposite" from the default setting. For example, if the default operand size is for 32-bit data operations, then presence of the operand size prefix toggles the instruction to 16-bit data operation. As another example, if the default effective address size is 16 bits, presence of the effective address size prefix toggles the instruction to use 32-bit effective address computations.
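The toggling rule described above can be sketched as follows. This is a minimal illustration, not a decoder: the function name is hypothetical, the prefixes are assumed to have already been collected into a byte string, and the prefix values are the encodings listed under the prefix bytes in Table 8-1 (01100110 for operand size, 01100111 for address size).

```python
OPERAND_SIZE_PREFIX = 0x66  # 01100110
ADDRESS_SIZE_PREFIX = 0x67  # 01100111


def effective_sizes(d_bit: int, prefixes: bytes) -> tuple[int, int]:
    """Return (operand size, address size) in bits for one instruction.

    d_bit is the D bit of the code segment descriptor (taken as 0 in real
    address mode and virtual 8086 mode). Each size prefix toggles its
    attribute to the value opposite the default.
    """
    default = 32 if d_bit else 16
    opposite = 16 if d_bit else 32
    operand = opposite if OPERAND_SIZE_PREFIX in prefixes else default
    address = opposite if ADDRESS_SIZE_PREFIX in prefixes else default
    return operand, address


if __name__ == "__main__":
    print(effective_sizes(1, b""))          # 32-bit segment, no prefixes
    print(effective_sizes(1, b"\x66"))      # operand size toggled to 16
    print(effective_sizes(0, b"\x66\x67"))  # both toggled to 32
```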
These 32-bit extensions are available in all Military Intel386 processor modes, including the real address mode or the virtual M8086 mode. In these modes the default is always 16 bits, so prefixes are needed to specify 32-bit operands or addresses.

Unless specified otherwise, instructions with 8-bit and 16-bit operands do not affect the contents of the high-order bits of the extended registers.

7.2.3 Encoding of Instruction Fields

Within the instruction are several fields indicating register selection, addressing mode and so on. The exact encodings of these fields are defined in the subsections that follow.

7.2.3.1 Encoding of Operand Length (w) Field

For any given instruction performing a data operation, the instruction is executing as a 32-bit operation or a 16-bit operation. Within the constraints of the operation size, the w field encodes the operand size as either one byte or the full operation size, as shown in the table below.

w field   Operand size during          Operand size during
          16-bit data operations       32-bit data operations
0         8 bits                       8 bits
1         16 bits                      32 bits

7.2.3.2 Encoding of the General Register (reg) Field

The general register is specified by the reg field, which may appear in the primary opcode bytes, or as the reg field of the "mod r/m" byte, or as the r/m field of the "mod r/m" byte.

Encoding of reg Field When w Field is Not Present in Instruction

reg field   Register selected during     Register selected during
            16-bit data operations       32-bit data operations
000         AX                           EAX
001         CX                           ECX
010         DX                           EDX
011         BX                           EBX
100         SP                           ESP
101         BP                           EBP
110         SI                           ESI
111         DI                           EDI

Encoding of reg Field When w Field is Present in Instruction

Register specified by reg field during 16-bit data operations:

reg   Function of w field
      (when w = 0)   (when w = 1)
000   AL             AX
001   CL             CX
010   DL             DX
011   BL             BX
100   AH             SP
101   CH             BP
110   DH             SI
111   BH             DI
Register specified by reg field during 32-bit data operations:

reg   Function of w field
      (when w = 0)   (when w = 1)
000   AL             EAX
001   CL             ECX
010   DL             EDX
011   BL             EBX
100   AH             ESP
101   CH             EBP
110   DH             ESI
111   BH             EDI

7.2.3.3 Encoding of the Segment Register (sreg) Field

The sreg field in certain instructions is a 2-bit field allowing one of the four M80286 segment registers to be specified. The sreg field in other instructions is a 3-bit field, allowing the Military Intel386 processor FS and GS segment registers to be specified.

2-Bit sreg2 Field

2-bit sreg2 field   Segment register selected
00                  ES
01                  CS
10                  SS
11                  DS

3-Bit sreg3 Field

3-bit sreg3 field   Segment register selected
000                 ES
001                 CS
010                 SS
011                 DS
100                 FS
101                 GS
110                 do not use
111                 do not use

7.2.3.4 Encoding of Address Mode

Except for special instructions, such as PUSH or POP, where the addressing mode is pre-determined, the addressing mode for the current instruction is specified by addressing bytes following the primary opcode. The primary addressing byte is the "mod r/m" byte, and a second byte of addressing information, the "s-i-b" (scale-index-base) byte, can be specified.

The s-i-b byte is specified when using 32-bit addressing mode and the "mod r/m" byte has r/m = 100 and mod = 00, 01 or 10. When the s-i-b byte is present, the 32-bit addressing mode is a function of the mod, ss, index, and base fields.

The primary addressing byte, the "mod r/m" byte, also contains three bits (shown as TTT in Figure 8-1) sometimes used as an extension of the primary opcode. The three bits, however, may also be used as a register field (reg).

When calculating an effective address, either 16-bit addressing or 32-bit addressing is used. 16-bit addressing uses 16-bit address components to calculate the effective address while 32-bit addressing uses 32-bit address components to calculate the effective address.
When 16-bit addressing is used, the "mod r/m" byte is interpreted as a 16-bit addressing mode specifier. When 32-bit addressing is used, the "mod r/m" byte is interpreted as a 32-bit addressing mode specifier.

The tables that follow define all encodings of all 16-bit addressing modes and 32-bit addressing modes.
Encoding of 16-bit Address Mode with "mod r/m" Byte

mod r/m   Effective address
00 000    DS:[BX+SI]
00 001    DS:[BX+DI]
00 010    SS:[BP+SI]
00 011    SS:[BP+DI]
00 100    DS:[SI]
00 101    DS:[DI]
00 110    DS:d16
00 111    DS:[BX]
01 000    DS:[BX+SI+d8]
01 001    DS:[BX+DI+d8]
01 010    SS:[BP+SI+d8]
01 011    SS:[BP+DI+d8]
01 100    DS:[SI+d8]
01 101    DS:[DI+d8]
01 110    SS:[BP+d8]
01 111    DS:[BX+d8]
10 000    DS:[BX+SI+d16]
10 001    DS:[BX+DI+d16]
10 010    SS:[BP+SI+d16]
10 011    SS:[BP+DI+d16]
10 100    DS:[SI+d16]
10 101    DS:[DI+d16]
10 110    SS:[BP+d16]
10 111    DS:[BX+d16]
11 000    register (see below)
11 001    register (see below)
11 010    register (see below)
11 011    register (see below)
11 100    register (see below)
11 101    register (see below)
11 110    register (see below)
11 111    register (see below)

Register specified by r/m during 16-bit data operations:

mod r/m   Function of w field
          (when w = 0)   (when w = 1)
11 000    AL             AX
11 001    CL             CX
11 010    DL             DX
11 011    BL             BX
11 100    AH             SP
11 101    CH             BP
11 110    DH             SI
11 111    BH             DI

Register specified by r/m during 32-bit data operations:

mod r/m   Function of w field
          (when w = 0)   (when w = 1)
11 000    AL             EAX
11 001    CL             ECX
11 010    DL             EDX
11 011    BL             EBX
11 100    AH             ESP
11 101    CH             EBP
11 110    DH             ESI
11 111    BH             EDI
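The regularity of the 16-bit address mode table can be captured in a short lookup. The sketch below is illustrative only: the function name is hypothetical, and the default segment rule (SS whenever BP takes part in the address, DS otherwise) is simply read off the table rows above.

```python
# Base/index combination selected by the r/m field in 16-bit addressing.
BASES_16 = ["BX+SI", "BX+DI", "BP+SI", "BP+DI", "SI", "DI", "BP", "BX"]


def describe_16bit_mode(mod: int, rm: int) -> str:
    """Describe the effective address selected by mod and r/m (16-bit)."""
    if mod == 0b11:
        return "register (selected by r/m and the w field)"
    if mod == 0b00 and rm == 0b110:
        # Special case in the table: a bare 16-bit displacement, no BP.
        return "DS:d16"
    base = BASES_16[rm]
    segment = "SS" if "BP" in base else "DS"
    displacement = {0b00: "", 0b01: "+d8", 0b10: "+d16"}[mod]
    return f"{segment}:[{base}{displacement}]"


if __name__ == "__main__":
    for mod in range(4):
        for rm in range(8):
            print(f"{mod:02b} {rm:03b}  {describe_16bit_mode(mod, rm)}")
```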
Encoding of 32-bit Address Mode with "mod r/m" Byte (no "s-i-b" byte present):

mod r/m   Effective address
00 000    DS:[EAX]
00 001    DS:[ECX]
00 010    DS:[EDX]
00 011    DS:[EBX]
00 100    s-i-b is present
00 101    DS:d32
00 110    DS:[ESI]
00 111    DS:[EDI]
01 000    DS:[EAX+d8]
01 001    DS:[ECX+d8]
01 010    DS:[EDX+d8]
01 011    DS:[EBX+d8]
01 100    s-i-b is present
01 101    SS:[EBP+d8]
01 110    DS:[ESI+d8]
01 111    DS:[EDI+d8]
10 000    DS:[EAX+d32]
10 001    DS:[ECX+d32]
10 010    DS:[EDX+d32]
10 011    DS:[EBX+d32]
10 100    s-i-b is present
10 101    SS:[EBP+d32]
10 110    DS:[ESI+d32]
10 111    DS:[EDI+d32]
11 000    register (see below)
11 001    register (see below)
11 010    register (see below)
11 011    register (see below)
11 100    register (see below)
11 101    register (see below)
11 110    register (see below)
11 111    register (see below)

Register specified by reg or r/m during 16-bit data operations:

mod r/m   Function of w field
          (when w = 0)   (when w = 1)
11 000    AL             AX
11 001    CL             CX
11 010    DL             DX
11 011    BL             BX
11 100    AH             SP
11 101    CH             BP
11 110    DH             SI
11 111    BH             DI

Register specified by reg or r/m during 32-bit data operations:

mod r/m   Function of w field
          (when w = 0)   (when w = 1)
11 000    AL             EAX
11 001    CL             ECX
11 010    DL             EDX
11 011    BL             EBX
11 100    AH             ESP
11 101    CH             EBP
11 110    DH             ESI
11 111    BH             EDI
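The two register tables above differ only in the w = 1 column, so a register code can be decoded with two short lists. The following Python sketch shows one way to do it; the function name and parameter names are hypothetical.

```python
REG8 = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def register_name(code: int, w: int, operand_size: int) -> str:
    """Name the register selected by a 3-bit reg or r/m code (mod = 11).

    w = 0 selects a byte register regardless of the operation size;
    w = 1 selects the full-size register, 16 or 32 bits wide depending on
    whether the instruction executes as a 16-bit or 32-bit data operation.
    """
    if w == 0:
        return REG8[code]
    name = REG16[code]
    return "E" + name if operand_size == 32 else name


if __name__ == "__main__":
    for code in range(8):
        print(f"{code:03b}",
              register_name(code, 0, 16),
              register_name(code, 1, 16),
              register_name(code, 1, 32))
```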
Encoding of 32-bit Address Mode ("mod r/m" byte and "s-i-b" byte present):

mod base   Effective address
00 000     DS:[EAX+(scaled index)]
00 001     DS:[ECX+(scaled index)]
00 010     DS:[EDX+(scaled index)]
00 011     DS:[EBX+(scaled index)]
00 100     SS:[ESP+(scaled index)]
00 101     DS:[d32+(scaled index)]
00 110     DS:[ESI+(scaled index)]
00 111     DS:[EDI+(scaled index)]
01 000     DS:[EAX+(scaled index)+d8]
01 001     DS:[ECX+(scaled index)+d8]
01 010     DS:[EDX+(scaled index)+d8]
01 011     DS:[EBX+(scaled index)+d8]
01 100     SS:[ESP+(scaled index)+d8]
01 101     SS:[EBP+(scaled index)+d8]
01 110     DS:[ESI+(scaled index)+d8]
01 111     DS:[EDI+(scaled index)+d8]
10 000     DS:[EAX+(scaled index)+d32]
10 001     DS:[ECX+(scaled index)+d32]
10 010     DS:[EDX+(scaled index)+d32]
10 011     DS:[EBX+(scaled index)+d32]
10 100     SS:[ESP+(scaled index)+d32]
10 101     SS:[EBP+(scaled index)+d32]
10 110     DS:[ESI+(scaled index)+d32]
10 111     DS:[EDI+(scaled index)+d32]

Note: mod field in "mod r/m" byte; ss, index, base fields in "s-i-b" byte.

ss   Scale factor
00   x1
01   x2
10   x4
11   x8

index   Index register
000     EAX
001     ECX
010     EDX
011     EBX
100     no index reg **
101     EBP
110     ESI
111     EDI

** IMPORTANT NOTE: When index field is 100, indicating "no index register," then ss field MUST equal 00. If index is 100 and ss does not equal 00, the effective address is undefined.
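To make the s-i-b tables concrete, the sketch below computes the default segment and the 32-bit offset from the mod, ss, index and base fields. It is an illustration under stated assumptions: the function name and the register dictionary are hypothetical, the displacement is passed in already decoded, and the undefined case (index = 100 with ss not 00) is reported as an error rather than modelled.

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def sib_effective_address(mod, ss, index, base, regs, disp=0):
    """Return (default segment, 32-bit offset) for an s-i-b encoded operand.

    regs maps register names to values; disp is the d8/d32 value, or 0.
    """
    if mod not in (0b00, 0b01, 0b10):
        raise ValueError("mod = 11 selects a register, not a memory operand")
    if index == 0b100:
        if ss != 0b00:
            raise ValueError("index = 100 with ss != 00 is undefined")
        scaled_index = 0  # no index register
    else:
        scaled_index = regs[REG32[index]] << ss  # scale factor 1, 2, 4 or 8
    if mod == 0b00 and base == 0b101:
        base_value, segment = 0, "DS"  # d32 takes the place of the base
    else:
        base_value = regs[REG32[base]]
        segment = "SS" if base in (0b100, 0b101) else "DS"
    return segment, (base_value + scaled_index + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    regs = {name: 0 for name in REG32}
    regs.update(EAX=0x1000, ECX=3, EBP=0x2000)
    print(sib_effective_address(0b00, 0b10, 0b001, 0b000, regs))  # EAX + ECX*4
```

The offset is masked to 32 bits so that the sum wraps the way a 32-bit address computation would.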
7.2.3.5 Encoding of Operation Direction (d) Field

In many two-operand instructions the d field is present to indicate which operand is considered the source and which is the destination.

d   Direction of operation
0   Register/memory <-- register. "reg" field indicates source operand; "mod r/m" or "mod ss index base" indicates destination operand.
1   Register <-- register/memory. "reg" field indicates destination operand; "mod r/m" or "mod ss index base" indicates source operand.

7.2.3.6 Encoding of Sign-Extend (s) Field

The s field occurs primarily in instructions with immediate data fields. The s field has an effect only if the size of the immediate data is 8 bits and is being placed in a 16-bit or 32-bit destination.

s   Effect on immediate data8                                  Effect on immediate data 16|32
0   None                                                       None
1   Sign-extend data8 to fill 16-bit or 32-bit destination     None

7.2.3.7 Encoding of Conditional Test (tttn) Field

For the conditional instructions (conditional jumps and set on condition), tttn is encoded with n indicating to use the condition (n = 0) or its negation (n = 1), and ttt giving the condition to test.

Mnemonic   Condition                                  tttn
O          Overflow                                   0000
NO         No overflow                                0001
B/NAE      Below/not above or equal                   0010
NB/AE      Not below/above or equal                   0011
E/Z        Equal/zero                                 0100
NE/NZ      Not equal/not zero                         0101
BE/NA      Below or equal/not above                   0110
NBE/A      Not below or equal/above                   0111
S          Sign                                       1000
NS         Not sign                                   1001
P/PE       Parity/parity even                         1010
NP/PO      Not parity/parity odd                      1011
L/NGE      Less than/not greater or equal             1100
NL/GE      Not less than/greater or equal             1101
LE/NG      Less than or equal/not greater than        1110
NLE/G      Not less or equal/greater than             1111

7.2.3.8 Encoding of Control or Debug or Test Register (eee) Field

The eee field is used for the loading and storing of the control, debug and test registers.
When interpreted as control register field:

eee code   Reg name
000        CR0
010        CR2
011        CR3

Do not use any other encoding.

When interpreted as debug register field:

eee code   Reg name
000        DR0
001        DR1
010        DR2
011        DR3
110        DR6
111        DR7

Do not use any other encoding.

When interpreted as test register field:

eee code   Reg name
110        TR6
111        TR7

Do not use any other encoding.
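The tttn field of section 7.2.3.7 and the eee field above both lend themselves to simple table lookups. The Python sketch below is illustrative only: the function names are hypothetical, tttn is taken from the low four bits of the conditional jump and set-byte opcodes as they appear in Table 8-1, and eee is taken from bits 5-3 of the "11 eee reg" byte of the MOV control/debug/test register forms.

```python
CONDITIONS = ["O", "NO", "B/NAE", "NB/AE", "E/Z", "NE/NZ", "BE/NA", "NBE/A",
              "S", "NS", "P/PE", "NP/PO", "L/NGE", "NL/GE", "LE/NG", "NLE/G"]

CONTROL_REGS = {0b000: "CR0", 0b010: "CR2", 0b011: "CR3"}
DEBUG_REGS = {0b000: "DR0", 0b001: "DR1", 0b010: "DR2", 0b011: "DR3",
              0b110: "DR6", 0b111: "DR7"}
TEST_REGS = {0b110: "TR6", 0b111: "TR7"}

# Second opcode byte of the MOV forms in Table 8-1 -> register class.
SPECIAL_MOV = {0x20: CONTROL_REGS, 0x22: CONTROL_REGS,
               0x21: DEBUG_REGS, 0x23: DEBUG_REGS,
               0x24: TEST_REGS, 0x26: TEST_REGS}


def condition_name(opcode_byte: int) -> str:
    """Name the condition tested, using the low four bits (tttn)."""
    return CONDITIONS[opcode_byte & 0x0F]


def negated(tttn: int) -> int:
    """Flip the n bit to obtain the encoding of the opposite condition."""
    return tttn ^ 0b0001


def special_register(second_opcode_byte: int, third_byte: int) -> str:
    """Name the control, debug or test register selected by the eee field."""
    eee = (third_byte >> 3) & 0b111
    table = SPECIAL_MOV[second_opcode_byte]
    if eee not in table:
        raise ValueError("encoding marked 'do not use'")
    return table[eee]


if __name__ == "__main__":
    print(condition_name(0b01111001))                # JNS, 8-bit displacement
    print(special_register(0x22, 0b11_011_000))      # MOV CR3 from register
```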
